CORRECTED VERSION*

PCT

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6: C07D 221/02, 215/12, C12Q 1/70, 1/68, 1/08,
C12P 21/06, C12N 15/00, 9/14, 9/84, 9/86, G01N 33/53, C07H 21/02, 21/04, A01N 43/04    A1

(11) International Publication Number: WO 98/13353
(43) International Publication Date: 2 April 1998 (02.04.98)

(21) International Application Number: PCT/US97/17395
(22) International Filing Date: 26 September 1997 (26.09.97)

(30) Priority Data:
08/719,697    26 September 1996 (26.09.96)    US

(63) Related by Continuation (CON) or Continuation-in-Part (CIP) to Earlier Application
US 08/719,697 (CIP)
Filed on 26 September 1996 (26.09.96)

(71) Applicant (for all designated States except US): AURORA BIOSCIENCES CORPORATION
[US/US]; 11149 N. Torrey Pines Road, La Jolla, CA 92037 (US).

(72) Inventors; and
(75) Inventors/Applicants (for US only): WHITNEY, Michael, A. [US/US]; 8320 Via Sonoma #81,
La Jolla, CA 92037 (US). NEGULESCU, Paul, A. [US/US]; 678 Solana Circle, Solana Beach,
CA 92075 (US). CRAIG, Frank [US/US]; 409 Santa Helena, Solana Beach, CA 92075 (US).
MERE, Lora [US/US]; 3522 Syracuse Avenue, San Diego, CA 92122 (US). FOULKES, Gorden, J.
[US/US]; 1220 Rancho Encinitas Drive, Encinitas, CA 92024 (US).

(74) Agent: HAILE, Lisa, A.; Fish & Richardson P.C., Suite 1400, 4225 Executive Square,
La Jolla, CA 92037 (US).

(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CU, CZ, DE, DK,
EE, ES, FI, GB, GE, GH, HU, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD,
MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, UA, UG,
US, UZ, VN, YU, ZW, ARIPO patent (GH, KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ,
BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE, CH, DE, DK, ES, FI, FR, GB, GR, IE,
IT, LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD,
TG).

Published
With international search report.
Before the expiration of the time limit for amending the claims and to be republished in
the event of the receipt of amendments.

(54) Title: METHODS AND COMPOSITIONS FOR SENSITIVE AND RAPID, FUNCTIONAL IDENTIFICATION OF
GENOMIC POLYNUCLEOTIDES AND USE FOR CELLULAR ASSAYS IN DRUG DISCOVERY

(57) Abstract

The invention provides for methods and compositions for identifying proteins or chemicals that directly or indirectly modulate a
genomic polynucleotide and methods for identifying active genomic polynucleotides. Generally, the method comprises inserting a BL
(beta-lactamase) expression construct into an eukaryotic genome, usually non-yeast, contained in at least one living cell, contacting the
cell with a predetermined concentration of a modulator, and detecting BL activity in the cell.

*(Referred to in PCT Gazette No. 38/1998, Section II)



Applicant: Virginia W. Cornish 
U.S. Serial No.: 09/768,479 
Filed: January 24, 2001 
Group Art Unit: 1614 

Exhibit 7



FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe



WO 98/13353 



PCT/US97/17395 



Methods and Compositions for Sensitive and Rapid, Functional 
Identification of Genomic Polynucleotides 
and Use For Cellular Assays in Drug Discovery 


Cross Reference to Related Applications 

This application claims the benefit of an earlier filing date to a patent application
identified as United States patent application No.: 08/719,697, filed September 26, 1996,
entitled "Methods and compositions for sensitive and rapid, functional identification of
genomic polynucleotides and secondary screening capabilities" of which the present
application is a Continuation-in-Part and which is incorporated herein by reference.

Technical Field 

The present invention generally relates to methods and compositions for the
identification of useful and functional portions of the genome and compounds for
modulating such portions of the genome, particularly the identification of proteins that
are directly or indirectly transcriptionally regulated and compounds for regulating such
proteins, either directly or indirectly.

Background 

The identification and isolation of useful portions of the genome requires 
extensive expenditure of time and financial resources. Currently, many genome projects 
use various strategies to reduce cloning and sequencing times. While genome projects
rapidly expand the database of genetic material, such projects often lack the ability to
integrate the information with the biology of the cell or organism from which the genes
were isolated. In some instances, coding regions of newly isolated genes reveal sequence
homology to other genes of known function. This type of analysis can, at best, provide
clues as to the possible relationships between different genes and proteins. Genomic
projects in general, however, suffer from the inability to rapidly and directly isolate and
identify specific, yet unknown, genes associated with a particular biological process or
processes.

The evaluation of the function of genes identified from genomic sequencing 
projects requires cloning the discovered gene into an expression system suitable for
functional screening. Transferring the discovered gene into a functional screening system
requires additional expenditure of time and resources without a guarantee that the correct
screening system was chosen. Since the function of the discovered gene is often
unknown or only surmised by inference to structurally related genes, the chosen
screening system may not have any relationship to the biological function of the gene.
For example, a gene may encode a protein that is structurally homologous to the
beta-adrenergic receptor and have a dissimilar function. Further, if negative results are
obtained in the screen, it cannot be easily determined whether 1) the gene or gene
product is not functioning properly in the screening assay or 2) the gene or gene product
is directly or indirectly involved in the biological process being assayed by the screening
system.

Consequently, there is a need to provide methods and compositions for rapidly 
isolating portions of genomes associated with a known biological process and to screen 
such portions of genomes for activity without the necessity of transferring the gene of 
interest into an additional screening system.

Brief Description of the Figures 

FIG. 1 shows a comparison between an application of a prior art reporter gene 
with methods described herein, and one embodiment of the invention. The prior art uses 
the b-gal reporter and requires the establishment of clones prior to expression analysis.
One embodiment of this invention allows for the rapid identification of living cell clones
from large multiclonal populations of BLEC (beta-lactamase expression construct)
integrated cells. This is a significant advancement over the prior art, which requires the
analysis of individual clones followed by the retrieval of the selected clone from a duplicate
clonal stock of living cells.

FIG. 2 shows a representation of how one embodiment of the invention reports
the expression of a pathway within a cell and can be used for screening.

FIG. 3 shows a schematic plasmid map of the BLEC-1. 

FIG. 4 shows the FACS analysis of a population of genomically BLEC 
integrated clones. Individual cells are plotted by fluorescent emission properties at 400
nm excitation. The x axis represents green emission (530 nm). The y axis represents blue
emission (465 nm). Cells with a high blue/green ratio will appear blue in color and cells
with a low blue/green ratio will appear green in color. A) Unselected multiclonal
population of BLEC integrated RBL-1 cell clones. B) Population of clones sorted from
3A (R1) that were cultured for an additional 7 days and resorted. C) Population from 3B
with addition of 1 µM ionomycin for 12 hours prior to sorting.
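
The blue/green readout described for FIG. 4 can be illustrated with a short Python sketch. It is not part of the application: the class name `CellEvent`, the function names, and the ratio cutoff of 1.0 are assumptions chosen for illustration only. In practice the boundary between "blue" and "green" cells would be set by the sort gates of the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellEvent:
    """One cell excited at 400 nm (intensities in arbitrary units)."""
    blue_465: float   # emission read at 465 nm (y axis in FIG. 4)
    green_530: float  # emission read at 530 nm (x axis in FIG. 4)


def blue_green_ratio(event: CellEvent) -> float:
    """Return blue emission divided by green emission."""
    if event.green_530 <= 0:
        raise ValueError("green emission must be positive to form a ratio")
    return event.blue_465 / event.green_530


def apparent_color(event: CellEvent, cutoff: float = 1.0) -> str:
    """Label a cell 'blue' (high ratio) or 'green' (low ratio).

    The cutoff is a hypothetical illustration value, not taken from the text.
    """
    return "blue" if blue_green_ratio(event) >= cutoff else "green"


if __name__ == "__main__":
    for cell in (CellEvent(900.0, 300.0), CellEvent(100.0, 400.0)):
        print(cell, blue_green_ratio(cell), apparent_color(cell))
```

Using a ratio rather than a single emission channel means the label depends on the relative blue and green signal of each cell, not on its absolute brightness.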



Summary 

The present invention recognizes that β-lactamase polynucleotides can be
effectively used in living eukaryotic cells to functionally identify active portions of a
genome directly or indirectly associated with a biological process. The present invention
also recognizes for the first time that β-lactamase activity can be measured using
membrane permeant substrates in living cells incubated with a test chemical that directly
or indirectly interacts with a portion of the genome having an integrated β-lactamase
polynucleotide. The present invention, thus, permits the rapid identification and isolation
of genomic polynucleotides indirectly or directly associated with a defined biological
process and identification of compounds that modulate such processes and regions of the
genome. Because the identification of active genomic polynucleotides is permitted in
living cells, further functional characterization can be conducted using the same cells, and
optionally, the same screening assay. The ability to functionally screen immediately after
the rapid identification of a functionally active portion of a genome, without the necessity
of transferring the identified portion of the genome into a secondary screening system,
represents, among other things, a distinct advantage over an application of a prior art
reporter gene with the methods described herein, as shown in FIG. 1.

The invention provides for a method of identifying portions of a genome, e.g.
genomic polynucleotides, in a living cell using a polynucleotide encoding a protein with
β-lactamase activity that can be detected with a membrane permeant β-lactamase
substrate. Typically, the method involves inserting a polynucleotide encoding a protein
with β-lactamase activity into the genome of an organism using any method known in the
art, developed in the future or described herein. Usually, a β-lactamase expression
construct will be used to integrate a β-lactamase polynucleotide into a eukaryotic
genome, as described herein. The cell, such as a eukaryotic cell, is usually contacted with
a predetermined concentration of a modulator, either before or after integration of the
β-lactamase polynucleotide. β-Lactamase activity is usually then measured inside the living
cell, preferably with fluorescent, membrane permeant β-lactamase substrates that are
transformed by the cell into membrane impermeant β-lactamase substrates as described
herein.

The invention also provides for a method of identifying proteins or compounds 
that directly or indirectly modulate a genomic polynucleotide. Generally, the method
comprises inserting a β-lactamase expression construct into an eukaryotic genome,
usually non-yeast, contained in at least one living cell, contacting the cell with a
predetermined concentration of a modulator, and detecting β-lactamase activity in the
cell. 

The invention also provides for a method of screening compounds with an active
genomic polynucleotide that comprises: 1) optionally contacting a multiclonal population
of cells with a first test chemical prior to separating said cells by a FACS, 2) separating
by a FACS said multiclonal population of cells into β-lactamase expressing cells and
non-β-lactamase expressing cells, wherein said β-lactamase expressing cells have a detectable
difference in cellular fluorescence properties compared to non-β-lactamase expressing
cells, 3) contacting either population of cells with the same or a different test chemical,
and 4) optionally repeating step (2), wherein said multiclonal population of cells
comprises eukaryotic cells having a β-lactamase expression construct integrated into a
genome of said eukaryotic cells and a membrane permeant β-lactamase substrate
transformed inside said cells to a membrane impermeant β-lactamase substrate. The
steps of this method can be repeated to permit additional characterization of identified
clones.
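
The sort, treat, and re-sort sequence in steps 2) to 4) above can be sketched in Python as follows. This is only an illustration under stated assumptions: each clone is reduced to a single blue/green emission ratio, the cutoff of 1.0 stands in for a FACS gate, and the `treat` callable is a hypothetical placeholder for contacting cells with a test chemical. None of these names or values come from the application.

```python
from typing import Callable, Dict, Tuple

# Hypothetical representation: clone identifier -> blue/green emission ratio.
Population = Dict[str, float]


def facs_sort(population: Population, cutoff: float = 1.0) -> Tuple[Population, Population]:
    """Split a population into (expressing, non_expressing) pools by ratio."""
    expressing = {c: r for c, r in population.items() if r >= cutoff}
    non_expressing = {c: r for c, r in population.items() if r < cutoff}
    return expressing, non_expressing


def sort_treat_resort(
    population: Population,
    treat: Callable[[Population], Population],
    cutoff: float = 1.0,
) -> Tuple[Population, Population]:
    """Step 2: sort. Step 3: treat the non-expressing pool. Step 4: sort again.

    Returns (induced, still_off) for the pool that was treated.
    """
    _, non_expressing = facs_sort(population, cutoff)
    treated = treat(non_expressing)
    return facs_sort(treated, cutoff)


if __name__ == "__main__":
    start = {"clone_1": 0.2, "clone_2": 3.0, "clone_3": 0.3}

    def demo_chemical(pool: Population) -> Population:
        # Assumed behaviour for the demo: only clone_1 responds to the chemical.
        return {c: (r * 10 if c == "clone_1" else r) for c, r in pool.items()}

    induced, still_off = sort_treat_resort(start, demo_chemical)
    print("induced:", sorted(induced), "unchanged:", sorted(still_off))
```

The sketch treats the non-expressing pool, which would surface clones whose reporter is switched on by the chemical. The text permits treating either pool, so the same helper could be applied to the expressing pool to look for clones that are switched off.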

The invention also includes powerful methods and compositions for identifying 
physiologically relevant cellular pathways and proteins of interest of known, unknown or 
partially known function. As shown in FIG. 2, a pathway may have more than one major
intracellular signal. Two major intracellular pathways are shown ("A" and "B"). Each
intracellular signal pathway may also have multiple branches. Each arm is shown as
having three signaling pathways (A1, A2, and A3; and B1, B2, and B3). By generating a
library of clones with a β-lactamase expression construct, genomic polynucleotides for
each signal pathway can be tagged or reported by the expression of β-lactamase.
Pathways not affected by the modulator (shown as C1, C2, and C3) are also tagged with
the β-lactamase expression construct. Because the modulator only modulates the expression
of pathways A1, A2, A3, B1, B2, and B3, only clones corresponding to these genomic
integration sites are identified as being responsive to the modulator. Clones
corresponding to sites C1, C2, and C3 remain unaltered and are not responsive to the
modulator. Any individual, modulated clone can be immediately isolated, if not already 
isolated, and used for a drug discovery assay to screen test chemicals for activity for 
modulating the reported pathway, as described herein. 
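
The FIG. 2 scenario, in which clones tagged at sites A1 to A3 and B1 to B3 respond to a modulator while clones tagged at C1 to C3 do not, can be illustrated with a small Python sketch. The fold-change criterion, the `min_fold` threshold of 2.0, and the example readings are assumptions made for illustration; the application does not prescribe a numerical rule for calling a clone responsive.

```python
from typing import Dict, List


def responsive_clones(
    baseline: Dict[str, float],
    treated: Dict[str, float],
    min_fold: float = 2.0,
) -> List[str]:
    """Return clones whose reporter signal changes at least min_fold either way.

    baseline and treated map a clone label to a positive reporter readout
    (for example a blue/green emission ratio) without and with modulator.
    """
    hits = []
    for clone, base in baseline.items():
        if base <= 0 or treated[clone] <= 0:
            raise ValueError(f"readout for {clone} must be positive")
        fold = treated[clone] / base
        if fold >= min_fold or fold <= 1.0 / min_fold:
            hits.append(clone)
    return sorted(hits)


if __name__ == "__main__":
    labels = ["A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2", "C3"]
    baseline = {label: 0.5 for label in labels}
    # Hypothetical readings: A sites go up, B sites go down, C sites are unchanged.
    treated = {"A1": 2.0, "A2": 1.5, "A3": 4.0,
               "B1": 0.125, "B2": 0.25, "B3": 0.125,
               "C1": 0.5, "C2": 0.5, "C3": 0.5}
    print(responsive_clones(baseline, treated))
```

Testing the fold change in both directions reflects the statement that an active genomic polynucleotide may be up-regulated or down-regulated by the modulator.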

The invention also includes tools for pathway identification and drug discovery 
that can be applied to a number of targets of interest and therapeutic areas including
proteins of interest, physiological responses even in the absence of a definitive target (e.g.
immune response, signal transduction, neuronal function and endocrine function), viral
targets, and orphan proteins.

Detailed Description of the Invention

Definitions 

Unless defined otherwise, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention belongs. Generally, the nomenclature used herein and the laboratory
procedures in cell culture, molecular genetics, and nucleic acid chemistry and
hybridization described below are those well known and commonly employed in the art.
Standard techniques are used for recombinant nucleic acid methods, polynucleotide
synthesis, and microbial culture and transformation (e.g., electroporation, lipofection).
Generally, enzymatic reactions and purification steps are performed according to the
manufacturer's specifications. The techniques and procedures are generally performed
according to conventional methods in the art and various general references (see
generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., which is incorporated herein
by reference) which are provided throughout this document. The nomenclature used
herein and the laboratory procedures in analytical chemistry, organic synthetic chemistry,
and pharmaceutical formulation described below are those well known and commonly
employed in the art. Standard techniques are used for chemical syntheses, chemical
analyses, pharmaceutical formulation and delivery, and treatment of patients. As
employed throughout the disclosure, the following terms, unless otherwise indicated,
shall be understood to have the following meanings:

"Fluorescent donor moiety" refers to a fluorogenic compound or part of a
compound (including a radical) which can absorb energy and is capable of transferring
the energy to another fluorogenic molecule or part of a compound. Suitable donor
fluorogenic molecules include, but are not limited to, coumarins and related dyes,
xanthene dyes such as fluoresceins, rhodols, and rhodamines, resorufins, cyanine dyes,
bimanes, acridines, isoindoles, dansyl dyes, aminophthalic hydrazides such as luminol
and isoluminol derivatives, aminophthalimides, aminonaphthalimides,
aminobenzofurans, aminoquinolines, dicyanohydroquinones, and europium and terbium
complexes and related compounds.

"Quencher" refers to a chromophoric molecule or part of a compound that is 
capable of reducing the emission from a fluorescent donor when attached to the donor. 
Quenching may occur by any of several mechanisms including fluorescence resonance 
energy transfer, photoinduced electron transfer, paramagnetic enhancement of 
intersystem crossing, Dexter exchange coupling, and exciton coupling such as the 
formation of dark complexes.

"Acceptor" refers to a quencher that operates via fluorescence resonance energy
transfer. Many acceptors can re-emit the transferred energy as fluorescence. Examples
include coumarins and related fluorophores, xanthenes such as fluoresceins, rhodols, and
rhodamines, resorufins, cyanines, difluoroboradiazaindacenes, and phthalocyanines.
Other chemical classes of acceptors generally do not re-emit the transferred energy.
Examples include indigos, benzoquinones, anthraquinones, azo compounds, nitro
compounds, indoanilines, di- and triphenylmethanes.

"Dye" refers to a molecule or part of a compound that absorbs specific
frequencies of light, including but not limited to ultraviolet light. The terms "dye" and
"chromophore" are synonymous.

"Fluorophore" refers to a chromophore that fluoresces.

"Membrane-permeant derivative" refers to a chemical derivative of a compound
that increases membrane permeability of the compound. These derivatives are made
better able to cross cell membranes, i.e. membrane permeant, because hydrophilic groups
are masked to provide more hydrophobic derivatives. Also, the masking groups are
designed to be cleaved from the fluorogenic substrate within the cell to generate the
derived substrate intracellularly. Because the substrate is more hydrophilic than the
membrane permeant derivative it is now trapped within the cells.

"Isolated polynucleotide" refers to a polynucleotide of genomic, cDNA, or 
synthetic origin or some combination thereof, which by virtue of its origin the "isolated
polynucleotide" (1) is not associated with the cell in which the "isolated polynucleotide"
is found in nature, or (2) is operably linked to a polynucleotide which it is not linked to in
nature. 

"Isolated protein" refers to a protein of cDNA, recombinant RNA, or synthetic 
origin or some combination thereof, which by virtue of its origin the "isolated protein" (1)
is not associated with proteins it is normally found with in nature, or (2) is isolated
from the cell in which it normally occurs or (3) is isolated free of other proteins from the
same cellular source, e.g. free of human proteins, or (4) is expressed by a cell from a
different species, or (5) does not occur in nature.

"Polypeptide" is used herein as a generic term to refer to native protein,
fragments, or analogs of a polypeptide sequence. Hence, native protein, fragments, and
analogs are species of the polypeptide genus. Preferred β-lactamase polypeptides
include those with the polypeptide sequence represented in the SEQUENCE ID.
LISTING and any other polypeptide or protein having similar β-lactamase activity as
measured by one or more of the assays described herein. β-Lactamase polypeptides or
proteins can include any protein having sufficient activity for detection in the assays
described herein.

"Naturally-occurring" as used herein, as applied to an object, refers to the fact that
an object can be found in nature. For example, a polypeptide or polynucleotide sequence
that is present in an organism (including viruses) that can be isolated from a source in
nature and which has not been intentionally modified by man in the laboratory is
naturally-occurring.

"Operably linked" refers to a juxtaposition wherein the components so described 
are in a relationship permitting them to function in their intended manner. A control 
sequence "operably linked" to a coding sequence is ligated in such a way that expression
of the coding sequence is achieved under conditions compatible with the control
sequences. 

"Control sequence" refers to polynucleotide sequences which are necessary to 
effect the expression of coding and non-coding sequences to which they are ligated. The 
nature of such control sequences differs depending upon the host organism; in
prokaryotes, such control sequences generally include promoter, ribosomal binding site,
and transcription termination sequence; in eukaryotes, generally, such control sequences
include promoters and transcription termination sequence. The term "control sequences"
is intended to include, at a minimum, components whose presence can influence
expression, and can also include additional components whose presence is advantageous,
for example, leader sequences and fusion partner sequences.

"Polynucleotide" refers to a polymeric form of nucleotides of at least 10 bases in 
length, either ribonucleotides or deoxynucleotides or a modified form of either type of 
nucleotide. The term includes single and double stranded forms of DNA. "Genomic
polynucleotide" refers to a portion of a genome. "Active genomic polynucleotide" or
"active portion of a genome" refer to regions of a genome that can be up-regulated,
down-regulated or both, either directly or indirectly, by a biological process. "Directly," in the
context of a biological process or processes, refers to direct causation of a process that
does not require intermediate steps, usually caused by one molecule contacting or binding
to another molecule (the same type or different type of molecule). For example, molecule
A contacts molecule B, which causes molecule B to exert effect X that is part of a
biological process. "Indirectly," in the context of a biological process or processes, refers
to indirect causation that requires intermediate steps, usually caused by two or more
direct steps. For example, molecule A contacts molecule B to exert effect X which in

turn causes effect Y.

"β-lactamase polynucleotide" refers to a polynucleotide encoding a protein with
β-lactamase activity. Preferably, the protein with β-lactamase activity can be measured
in a FACS at 22 degrees using a CCF2-AM β-lactamase substrate at a level of about
1,000 such protein molecules or less per cell. More preferably, the protein with
β-lactamase activity can be measured in a FACS at 22 degrees using a CCF2-AM
β-lactamase substrate at a level of about 300 to 1,000 such protein molecules per cell. More
preferably, the protein with β-lactamase activity can be measured in a FACS at 22
degrees using a CCF2-AM β-lactamase substrate at a level of about 25 to 300 such protein
molecules per cell. Proteins with β-lactamase activity that require more than 1,000
molecules of such protein per cell for detection with a FACS at 22 degrees using a
CCF2-AM β-lactamase substrate can be used and preferably have at least 5% of the
activity of the protein with SEQ. ID. NO.: 1.
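
The detection ranges stated in this definition can be summarised with a short Python sketch. It is an illustration only: the function name, the returned labels, and the treatment of the boundary values (the text says "about" for every figure) are assumptions, not part of the definition.

```python
def detection_range(molecules_per_cell: float) -> str:
    """Map the number of protein molecules per cell needed for FACS detection
    to the ranges named in the definition.

    Boundary handling is an assumption, since the text gives approximate figures.
    """
    if molecules_per_cell <= 0:
        raise ValueError("molecules_per_cell must be positive")
    if molecules_per_cell > 1000:
        return "above_1000"    # usable; preferably at least 5% of reference activity
    if molecules_per_cell > 300:
        return "300_to_1000"   # "more preferably" range
    if molecules_per_cell >= 25:
        return "25_to_300"     # second "more preferably" range
    return "below_25"          # still within "about 1,000 ... or less"


if __name__ == "__main__":
    for n in (5000, 1000, 300, 25, 10):
        print(n, detection_range(n))
```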

"Sequence homology" refers to the proportion of base matches between two 

nucleic acid sequences or the proportion of amino acid matches between two amino acid
sequences. When sequence homology is expressed as a percentage, e.g., 50%, the
percentage denotes the proportion of matches over the length of sequence from a desired
sequence (e.g. β-lactamase sequences, such as SEQ. ID. NO.: 1) that is compared to
some other sequence. Gaps (in either of the two sequences) are permitted to maximize
matching; gap lengths of 15 bases or less are usually used, 6 bases or less are preferred
with 2 bases or less more preferred. When using oligonucleotides as probes or treatments
the sequence homology between the target nucleic acid and the oligonucleotide sequence
is generally not less than 17 target base matches out of 20 possible oligonucleotide base
pair matches (85%); preferably not less than 9 matches out of 10 possible base pair

matches (90%), and most preferably not less than 19 matches out of 20 possible base pair
matches (95%).

"Selectively hybridize" refers to detectably and specifically bind.
Polynucleotides, oligonucleotides and fragments thereof selectively hybridize to target
nucleic acid strands, under hybridization and wash conditions that minimize appreciable
amounts of detectable binding to nonspecific nucleic acids. High stringency conditions
can be used to achieve selective hybridization conditions as known in the art and
discussed herein. Generally, the nucleic acid sequence homology between the
polynucleotides, oligonucleotides, and fragments thereof and a nucleic acid sequence of
interest will be at least 30%, and more typically with preferably increasing homologies of
at least about 40%, 50%, 60%, 70%, and 90%.

Typically, hybridization and washing conditions are performed at high stringency
according to conventional hybridization procedures. Positive clones are isolated and
sequenced. For illustration and not for limitation, a full-length polynucleotide
corresponding to the nucleic acid sequence of SEQ. ID. NO.: 1 may be labeled and used as



a hybridization probe to isolate genomic clones from the appropriate target library in
λEMBL4 or λGEM11 (Promega Corporation, Madison, Wisconsin); typical
hybridization conditions for screening plaque lifts (Benton and Davis (1978) Science 196:
180) can be: 50% formamide, 5 x SSC or SSPE, 1-5 x Denhardt's solution, 0.1-1% SDS,
100-200 µg sheared heterologous DNA or tRNA, 0-10% dextran sulfate, 1 x 10^5 to 1 x 10^7
cpm/ml of denatured probe with a specific activity of about 1 x 10^8 cpm/µg, and
incubation at 42°C for about 6-36 hours. Prehybridization conditions are essentially
identical except that probe is not included and incubation time is typically reduced.
Washing conditions are typically 1-3 x SSC, 0.1-1% SDS, 50-70°C with change of wash
solution at about 5-30 minutes. Cognate sequences, including allelic sequences, can be
obtained in this manner.

Two amino acid sequences are homologous if there is a partial or complete
identity between their sequences. For example, 85% homology means that 85% of the
amino acids are identical when the two sequences are aligned for maximum matching.
Gaps (in either of the two sequences being matched) are allowed in maximizing
matching; gap lengths of 5 or less are preferred with 2 or less being more preferred.
Alternatively and preferably, two protein sequences (or polypeptide sequences derived
from them of at least 30 amino acids in length) are homologous, as this term is used
herein, if they have an alignment score of more than 5 (in standard deviation units)
using the program ALIGN with the mutation data matrix and a gap penalty of 6 or
greater. See Dayhoff, M.O., in Atlas of Protein Sequence and Structure, 1972, volume 5,
National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this
volume, pp. 1-10. The two sequences or parts thereof are more preferably homologous if
their amino acids are greater than or equal to 30% identical when optimally aligned using
the ALIGN program.

"Corresponds to" means that a polynucleotide sequence is homologous (i.e., is
identical, not strictly evolutionarily related) to all or a portion of a reference
polynucleotide sequence, or that a polypeptide sequence is identical to all or a portion of
a reference polypeptide sequence. In contradistinction, the term "complementary to" is
used herein to mean that the complementary sequence is homologous to all or a portion of
a reference polynucleotide sequence. For illustration, the nucleotide sequence "TATAC"
corresponds to a reference sequence "TATAC" and is complementary to a reference
sequence "GTATA".
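
The "TATAC"/"GTATA" illustration can be checked with a minimal sketch. The helper names are hypothetical, and "all or a portion of" is interpreted here, as an assumption, as an exact substring match on DNA written 5' to 3'.

```python
# Watson-Crick pairing for DNA, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def corresponds_to(query: str, reference: str) -> bool:
    """True if the query is identical to all or a portion of the reference."""
    return query.upper() in reference.upper()


def complementary_to(query: str, reference: str) -> bool:
    """True if the query's complement is identical to all or a portion of the reference."""
    return reverse_complement(query).upper() in reference.upper()
```

With these definitions, `corresponds_to("TATAC", "TATAC")` and `complementary_to("TATAC", "GTATA")` both hold, matching the example in the text.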

The following terms are used to describe the sequence relationships between two
or more polynucleotides: "reference sequence," "comparison window," "sequence
identity," "percentage of sequence identity," and "substantial identity." A "reference
sequence" is a defined sequence used as a basis for a sequence comparison; a reference
sequence may be a subset of a larger sequence, for example, as a segment of a full-length
cDNA or gene sequence given in a sequence listing such as SEQ. ID. NO.: 1, or may
comprise a complete cDNA or gene sequence. Generally, a reference sequence is at least
20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50
nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (i.e.,
a portion of the complete polynucleotide sequence) that is similar between the two
polynucleotides, and (2) may further comprise a sequence that is divergent between the
two polynucleotides, sequence comparisons between two (or more) polynucleotides are
typically performed by comparing sequences of the two polynucleotides over a
"comparison window" to identify and compare local regions of sequence similarity. A
"comparison window", as used herein, refers to a conceptual segment of at least 20
contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a
reference sequence of at least 20 contiguous nucleotides and wherein the portion of the
polynucleotide sequence in the comparison window may comprise additions or deletions
(i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not
comprise additions or deletions) for optimal alignment of the two sequences. Optimal
alignment of sequences for aligning a comparison window may be conducted by the local
homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482, by the
homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443,
by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci.
(U.S.A.) 85: 2444, by computerized implementations of these algorithms (GAP,
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release
7.0, Genetics Computer Group, 575 Science Dr., Madison, WI), or by inspection, and the
best alignment (i.e., resulting in the highest percentage of homology over the comparison
window) generated by the various methods is selected. The term "sequence identity"
means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide
basis) over the window of comparison. The term "percentage of sequence identity" is
calculated by comparing two optimally aligned sequences over the window of
comparison, determining the number of positions at which the identical nucleic acid base
(e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched
positions, dividing the number of matched positions by the total number of positions in
the window of comparison (i.e., the window size), and multiplying the result by 100 to
yield the percentage of sequence identity. The term "substantial identity" as used herein
denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide
comprises a sequence that has at least 30 percent sequence identity, preferably at least 50
to 60 percent sequence identity, more usually at least 60 percent sequence identity as
compared to a reference sequence over a comparison window of at least 20 nucleotide
positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage
of sequence identity is calculated by comparing the reference sequence to the
polynucleotide sequence which may include deletions or additions which total 20 percent
or less of the reference sequence over the window of comparison.
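
The "percentage of sequence identity" calculation described above can be sketched as follows. This is a minimal illustration, not the GAP or BESTFIT implementation: it assumes the two sequences have already been optimally aligned by some other tool, that gaps are written as "-", and that gap columns in either row count toward the 20 percent limit. All names and defaults are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_reference: str,
                     min_window: int = 20, max_gap_percent: int = 20) -> float:
    """Percentage of sequence identity over a pre-aligned comparison window."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    window = len(aligned_query)
    if window < min_window:
        raise ValueError(f"comparison window must span at least {min_window} positions")

    pairs = list(zip(aligned_query.upper(), aligned_reference.upper()))
    gap_columns = sum("-" in pair for pair in pairs)
    # Integer arithmetic avoids floating-point surprises at the 20 percent boundary.
    if gap_columns * 100 > max_gap_percent * window:
        raise ValueError(f"gaps exceed {max_gap_percent} percent of the window")

    # Matched positions: identical bases in both rows, gap columns excluded.
    matched = sum(a == b and a != "-" for a, b in pairs)
    return 100.0 * matched / window
```

A polynucleotide would then meet the lowest "substantial identity" bound stated above if the returned value is at least 30.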

As applied to polypeptides, the term "substantial identity" means that two peptide
sequences, when optimally aligned, such as by the programs GAP or BESTFIT using
default gap weights, share at least 30 percent sequence identity, preferably at least 40
percent sequence identity, more preferably at least 50 percent sequence identity, and most
preferably at least 60 percent sequence identity. Preferably, residue positions that are
not identical differ by conservative amino acid substitutions. Conservative amino acid
substitutions refer to the interchangeability of residues having similar side chains. For
example, a group of amino acids having aliphatic side chains is glycine, alanine, valine,
leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is
serine and threonine; a group of amino acids having amide-containing side chains is
asparagine and glutamine; a group of amino acids having aromatic side chains is
phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains
is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing
side chains is cysteine and methionine. Preferred conservative amino acid substitution
groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-
valine, glutamic-aspartic, and asparagine-glutamine.
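
As a hedged illustration, the preferred conservative substitution groups listed above can be encoded with standard one-letter amino acid codes and used to classify a non-identical residue pair. The function name is hypothetical and the check covers only the six preferred groups, not the broader side-chain families.

```python
# Preferred groups from the text, in one-letter codes:
# Val-Leu-Ile, Phe-Tyr, Lys-Arg, Ala-Val, Glu-Asp, Asn-Gln.
PREFERRED_GROUPS = [
    frozenset("VLI"),
    frozenset("FY"),
    frozenset("KR"),
    frozenset("AV"),
    frozenset("ED"),
    frozenset("NQ"),
]


def is_preferred_conservative(residue_a: str, residue_b: str) -> bool:
    """True if two different residues fall within one preferred substitution group."""
    a, b = residue_a.upper(), residue_b.upper()
    if a == b:
        return False  # identical residues are matches, not substitutions
    return any(a in group and b in group for group in PREFERRED_GROUPS)
```

Note that valine appears in two groups, so alanine-valine and valine-leucine both qualify while alanine-leucine does not.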

"Polypeptide fragment" refers to a polypeptide that has an amino-terminal and/or
carboxy-terminal deletion, but where the remaining amino acid sequence is usually
identical to the corresponding positions in the naturally-occurring sequence deduced, for
example, from a full-length cDNA sequence (e.g., the sequence shown in SEQ. ID. NO.:
1). "β-lactamase polypeptide fragment" refers to a polypeptide that is comprised of a
segment of at least 25 amino acids that has substantial identity to a portion of the deduced
amino acid sequence shown in SEQ. ID. NO.: 1 and which has at least one of the
following properties: (1) specific binding to a β-lactamase substrate, preferably
cephalosporin, under suitable binding conditions, or (2) the ability to effectuate
enzymatic activity, preferably cephalosporin backbone cleavage activity, when expressed
in a mammalian cell. Typically, analog polypeptides comprise a conservative amino acid
substitution (or addition or deletion) with respect to the naturally occurring sequence.
Analogs typically are at least 300 amino acids long, preferably at least 500 amino acids
long or longer, most usually being as long as the full-length naturally-occurring polypeptide.

"Modulation" refers to the capacity to either enhance or inhibit a functional
property of a biological activity or process (e.g., enzyme activity or receptor binding).
Such enhancement or inhibition may be contingent on the occurrence of a specific event,
such as activation of a signal transduction pathway, and/or may be manifest only in
particular cell types.

The term "modulator" refers to a chemical (naturally occurring or non-naturally
occurring), such as a biological macromolecule (e.g. nucleic acid, protein, non-peptide, or
organic molecule), or an extract made from biological materials such as bacteria, plants,
fungi, or animal (particularly mammalian) cells or tissues. Modulators are typically
evaluated for potential activity as inhibitors or activators (directly or indirectly) of a
biological process or processes (e.g., agonist, partial antagonist, partial agonist,
antagonist, antineoplastic agents, cytotoxic agents, inhibitors of neoplastic transformation
or cell proliferation, cell proliferation-promoting agents, and the like) by inclusion in
assays described herein. The activity of a modulator may be known, unknown or partially
known.

The term "test chemical" refers to a chemical to be tested by one or more
method(s) of the invention as a putative modulator. A test chemical is usually not known
to bind to the target of interest. The term "control test chemical" refers to a chemical
known to bind to the target (e.g., a known agonist, antagonist, partial agonist or inverse
agonist). The term "test chemical" does not typically include a chemical added as a
control condition that alters the function of the target to determine signal specificity in an
assay. Such control chemicals or conditions include chemicals that 1) non-specifically or
substantially disrupt protein structure (e.g., denaturing agents (e.g., urea or guanidinium),
chaotropic agents, sulfhydryl reagents (e.g., dithiothreitol and β-mercaptoethanol), and
proteases), 2) generally inhibit cell metabolism (e.g., mitochondrial uncouplers) and 3)
non-specifically disrupt electrostatic or hydrophobic interactions of a protein (e.g., high
salt concentrations, or detergents at concentrations sufficient to non-specifically disrupt
hydrophobic interactions). The term "test chemical" also does not typically include
chemicals known to be unsuitable for a therapeutic use for a particular indication due to
toxicity to the subject. Usually, various predetermined concentrations of test chemicals are
used for screening such as .01 µM, .1 µM, 1.0 µM, and 10.0 µM.

The term "target" refers to a biochemical entity involved in a biological process.
Targets are typically proteins that play a useful role in the physiology or biology of an
organism. A therapeutic chemical binds to a target to alter or modulate its function. As
used herein, targets can include cell surface receptors, G-proteins, kinases, ion channels,
phospholipases and other proteins mentioned herein.

The terms "label" or "labeled" refer to incorporation of a detectable marker, e.g.,
by incorporation of a radiolabeled amino acid or attachment to a polypeptide of biotinyl
moieties that can be detected by marked avidin (e.g., streptavidin containing a fluorescent
marker or enzymatic activity that can be detected by optical or colorimetric methods).
Various methods of labeling polypeptides and glycoproteins are known in the art and may
be used. Examples of labels for polypeptides include, but are not limited to, the
following: radioisotopes (e.g., 3H, 14C, 35S, 125I, 131I), fluorescent labels (e.g., FITC,
rhodamine, and lanthanide phosphors), enzymatic labels (or reporter genes) (e.g.,
enzymatic reporter genes horseradish peroxidase, β-galactosidase, luciferase and alkaline
phosphatase; and non-enzymatic reporter genes (e.g., fluorescent proteins)),
chemiluminescent groups, biotinyl groups, predetermined polypeptide epitopes recognized by a
secondary reporter (e.g., leucine zipper pair sequences, binding sites for secondary
antibodies, metal binding domains, epitope tags).

"Substantially pure" means that an object
species is the predominant species present (i.e., on a molar basis it is more abundant than
any other individual species in the composition), and preferably a substantially purified
fraction is a composition wherein the object species comprises at least about 50 percent
(on a molar basis) of all macromolecular species present. Generally, a substantially pure
composition will comprise more than about 80 percent of all macromolecular species
present in the composition, more preferably more than about 85%, 90%, 95%, and 99%.
Most preferably, the object species is purified to essential homogeneity (contaminant
species cannot be detected in the composition by conventional detection methods)
wherein the composition consists essentially of a single macromolecular species.

"Pharmaceutical agent or drug" refers to a chemical or composition capable of
inducing a desired therapeutic effect when properly administered (e.g. using the proper
amount and delivery modality) to a patient.

Other chemistry terms herein are used according to conventional usage in the art,
as exemplified by The McGraw-Hill Dictionary of Chemical Terms (ed. Parker, S.,
1985), McGraw-Hill, San Francisco, incorporated herein by reference.

Introduction

The present invention recognizes that β-lactamase polynucleotides can be
effectively used in living eukaryotic cells to functionally identify active portions of a
genome directly or indirectly associated with a biological process. The present invention
also recognizes for the first time that β-lactamase activity can be measured using
membrane permeant substrates in living cells incubated with a test chemical that directly
or indirectly interacts with a portion of the genome having an integrated
β-lactamase polynucleotide. The present invention, thus, permits the rapid identification
and isolation of genomic polynucleotides indirectly or directly associated with a defined
biological process and identification of compounds that modulate such processes and
regions of the genome. Because the identification of active genomic polynucleotides is
permitted in living cells, further functional characterization can be conducted using the
same cells, and optionally, the same screening assay. The ability to functionally screen
immediately after the rapid identification of a functionally active portion of a genome,
without the necessity of transferring the identified portion of the genome into a secondary
screening system, represents, among other things, a distinct advantage over an application
of a prior art reporter gene and methods described herein, as shown in FIG. 1.

As a non-limiting introduction to the breadth of the invention, the invention 
includes several general and useful aspects, including: 

1) a method for identifying genes or gene products directly or indirectly
associated (e.g. regulated) with a biological process of interest (that can be
modulated by a compound) using a genomic polynucleotide operably linked
to a polynucleotide encoding a protein with β-lactamase activity,

2) a method for identifying proteins (e.g. orphan proteins or known proteins)
or compounds that directly or indirectly modulate (e.g. activate or inhibit
transcription) a genomic polynucleotide operably linked to a polynucleotide
encoding a protein with β-lactamase activity,

3) a method of screening for an active genomic polynucleotide (e.g. enhancer,
promoter or coding region in the genome) that can be directly or indirectly
associated (e.g. regulated) with a biological process of interest (that can be
modulated by a compound) using a genomic polynucleotide operably linked
to a polynucleotide encoding a protein with β-lactamase activity that can be
detected by FACS using a fluorescent, membrane permeant β-lactamase
substrate,

4) eukaryotic cells with a genomic polynucleotide operably linked to a
polynucleotide encoding a protein with β-lactamase activity, and

5) polynucleotides related to the above methods and cells.

These aspects of the invention, as well as others described herein, can be achieved by using the methods
and compositions of matter described herein. To gain a full appreciation of the scope of the invention, it
will be further recognized that various aspects of the invention can be combined to make desirable
embodiments of the invention. For example, the invention includes a method of identifying compounds
that modulate active genomic polynucleotides operably linked to a protein with β-lactamase activity that
can be detected by FACS using a fluorescent, membrane permeant β-lactamase substrate. Such
combinations result in particularly useful and robust embodiments of the invention.

Methods for Rapidly Identifying Functional Portions of a Genome

The invention provides for a method of identifying portions of a genome, e.g.
genomic polynucleotides, in a living cell using a polynucleotide encoding a protein with
β-lactamase activity that can be detected with a membrane permeant β-lactamase
substrate. Typically, the method involves inserting a polynucleotide encoding a protein
with β-lactamase activity into the genome of an organism using any method known in the
art, developed in the future or described herein. Usually, a β-lactamase expression
construct will be used to integrate a β-lactamase polynucleotide into a eukaryotic
genome, as described herein. The cell, such as a eukaryotic cell, is usually contacted with
a predetermined concentration of a modulator, either before or after integration of the
β-lactamase polynucleotide. β-lactamase activity is usually then measured inside the living
cell, preferably with fluorescent, membrane permeant β-lactamase substrates that are
transformed by the cell into membrane impermeant β-lactamase substrates as described
herein and in PCT Publication No. WO96/30540 published October 3, 1996, by Tsien et al.

Once β-lactamase polynucleotides are integrated into the genome of interest, they
come under the transcriptional control of the genome of the host cell. Integration into
the genome is usually stable, as described herein and known in the art. Transcriptional
control of the genome often results from receptor (e.g. intracellular or cell surface
receptor) activation, which can regulate transcriptional and translational events to change
the amount of protein present in the cell. The amount of protein present with β-lactamase
activity can be measured via its enzymatic action on a substrate. Normally, the substrate
is a small uncharged molecule that, when added to the extracellular solution, can
penetrate the plasma membrane to encounter the enzyme. A charged molecule can also
be employed, but the charges are generally masked by groups that will be cleaved by
endogenous or heterologous cellular enzymes or processes (e.g., esters cleaved by
cytoplasmic esterases). As described more fully herein and in PCT Publication No.
WO96/30540 published October 3, 1996, by Tsien et al., which is herein incorporated by
reference, the use of substrates that exhibit changes in their fluorescence spectra upon
interaction with an enzyme is particularly desirable. In some assays, the fluorogenic
substrate is converted to a fluorescent product by β-lactamase activity. Alternatively, the
fluorescent substrate changes fluorescence properties upon conversion by β-lactamase
activity. Preferably, the product should be very fluorescent to obtain a maximal signal,
and very polar, to stay trapped inside the cell.

Vectors and Integration

Vectors, such as viral and plasmid vectors, can be used to introduce genes or
genetic material of the invention into cells, preferably by integration into the host cell
genome. Such viral vectors can be any appropriate viruses, such as retroviruses,
adenoviruses, adeno-associated viruses, papillomaviruses, herpes viruses, or any
ecotropic or amphotropic virus, preferably a retrovirus. The viruses can be, for example,
retroviruses or any other virus modified to be replicatively deficient, cytomegalovirus,
Friend leukemia virus, SIV, HIV, Rous sarcoma virus, or Moloney virus such as
Moloney murine leukemia virus.

Vectors, such as retrovirus vectors, can encode an operable selective protein so
that cells that have been transformed can be positively selected for. Such selective
protein can be antibiotic resistance factors, such as neomycin resistance, such as NEO.
Alternatively, cells can be negatively selected for using an enzyme, such as herpes
simplex virus thymidine kinase (HSVTK) that transforms a pro-toxin into a toxin. Viral
vectors, such as retroviral vectors, are available that are suitable for these purposes, such
as PSIR vector (available from ClonTech of California with PT67 packaging cells) and
GglBHisen and GgTNKneolD and GgTKNeoen variants of Moloney murine leukemia
virus. Vector modifications can be made that allow more efficient
integration into the host cell genome. Such modifications include sequences that enhance
integration or known methods to promote nucleic acid transportation into the nucleus of
the host cell. Retroviral vectors such as those described in U.S. Patent Number 5,364,783 by
Ruley and von Melchner can also be used to increase transfection efficiency.

Vectors can also be used with liposomes or other vesicles that can transport
genetic material into a cell. Appropriate structures are known in the art. The liposomes
can include vectors such as plasmids or yeast artificial chromosomes (YACs), which can
include genetic material to be introduced into the cell. Plasmids can also be introduced
into cells by any known methods, such as electroporation, calcium phosphate, or
lipofection. DNA fragments, without a plasmid or viral vector, can also be used.

In one aspect of the present invention, vectors are used to introduce reporter genes
into cells. When the reporter gene integrates into the genome of a target cell so that the
reporter gene is expressed, that event can be detected by detecting the reporter gene.
Clones that express the reporter gene under a wide variety of conditions can be used for a
variety of purposes, including gene and drug discovery. Chromosomes tagged with
β-lactamase expression constructs can be transferred to desired recipient cells using methods
established in the art.

β-lactamase polynucleotides can be placed on a variety of plasmids for integration
into a genome and to identify genes from a large variety of organisms (Gorman, C.M. et
al., Mol. Cell Biol. 2: 1044-1051 (1982), Alam, J. and Cook, J.L., Anal. Biochem. 188:
245-254, (1990)). Standard techniques are used to introduce these polynucleotides into a
cell or whole organism (e.g., as described in Sambrook, J., Fritsch, E.F. and Maniatis, T.
Expression of cloned genes in cultured mammalian cells. In: Molecular Cloning, edited
by Nolan, C. New York: Cold Spring Harbor Laboratory Press, 1989). Resistance
markers can be used to select for successfully transfected cells.

If a β-lactamase expression construct is selected for integrating a β-lactamase
polynucleotide into a eukaryotic genome, it will usually contain at least a β-lactamase
polynucleotide operably linked to a splice acceptor and optionally a splice donor.
Alternatively, the β-lactamase polynucleotide may be operably linked to any means for
integrating a polynucleotide into a genome, preferably for integration into an intron of a
gene to produce an in frame translation product. The β-lactamase expression construct
can optionally comprise, depending on the application, an IRES element, a splice donor, a
poly A site, a translational start site (e.g. a Kozak sequence), an LTR (long terminal repeat)
and a selectable marker.

β-Lactamase Reporter Genes

Preferably, β-lactamase polynucleotides encode a cytosolic form of a protein with
β-lactamase activity. This provides the advantage of trapping the normally secreted
β-lactamase protein within the cell, which enhances the signal to noise ratio of the signal
associated with β-lactamase activity. Usually, this is accomplished by removing or
disabling the signal sequence normally present for secretion. As used herein, "cytosolic
protein with β-lactamase activity" refers to a protein with β-lactamase activity that lacks
the proper amino acid sequences for secretion from the cell, e.g., the signal sequence. For
example, in the polypeptide of SEQ. ID. NO.: 1, the signal sequence has been replaced
with the amino acids Met-Ser. Accordingly, upon expression, β-lactamase activity
remains within the cell. For expression in mammalian cells it is preferable to use
β-lactamase polynucleotides with nucleotide sequences preferred by mammalian cells. In
some instances, a secreted form of β-lactamase can be used with the methods and
compositions of the invention. In particular, genes having sequences that direct secretion
can be identified with a (word) β-lactamase assay. This also permits multiplexing based on
directed localization of β-lactamase.

Proteins with β-lactamase activity can be any known to the art, developed in the
future or described herein. This includes, for example, the enzymes represented by the SEQ.
ID. NO.'s described herein. Nucleic acids encoding proteins with β-lactamase activity
can be obtained by methods known in the art, for example, by polymerase chain reaction
of cDNA using primers based on the DNA sequence in SEQ. ID. NO.: 1. PCR methods
are described in, for example, U.S. Pat. No. 4,683,195; Mullis et al. (1987) Cold Spring
Harbor Symp. Quant. Biol. 51:263; and Erlich, ed., PCR Technology, (Stockton Press,
NY, 1989).

Sequences for Assisting Integration 

The β-lactamase expression construct typically includes sequences for integration,
especially sequences designed to target or enhance integration into the genome.
The splice site acceptor can be operably linked to the β-lactamase polynucleotide to
facilitate expression upon integration into an intron. Usually, a fusion RNA will be
created with the coding region of an adjacent operable portion of the exon. A splice
acceptor sequence is a sequence at the 3' end of an intron where it junctions with an exon.
The consensus sequence for a splice acceptor is NTN (TC) (TC) (TC) TTT (TC)
(TC) (TC) (TC) (TC) (TC) NCAGgt. The intronic sequences are represented by upper
case and the exonic sequence by lower case font. These sequences represent those that
are conserved from viral to primate genomes.

The splice donor site can be operably linked to the β-lactamase polynucleotide to
facilitate integration in an intron to promote expression by requiring a poly-adenylation
sequence. Usually, a fusion RNA is created with the coding region or untranslated region on the
3' end of the β-lactamase polynucleotide. This is preferred when it is desired to sequence
the coding region of the identified gene. A splice donor is a sequence at the 5' end of an
intron where it junctions with an exon. The consensus sequence for a splice donor
sequence is naggt(ag)aGT. The intronic sequences are represented by upper case and the
exonic sequence by lower case font. These sequences represent those that are conserved






from viral to primate genomes. This splice donor allows identification of the target gene
using 3' RACE.

As an alternative to a splice donor site, a poly A site may be operably linked to
the β-lactamase polynucleotide. Poly-adenylation signals, i.e. poly A sites, include SV40
poly A sites, such as those described in the Invitrogen Catalog 1996 (California).

In some instances, it may be desirable to include in the β-lactamase expression construct
a translational start site. For instance, a translational start site allows for β-lactamase
expression even if the integration occurs in non-coding regions. Usually, such sequences
will not reduce the expression of a highly expressed gene. Translational start sites
include a "Kozak sequence" and are the preferred sequences for expression in mammalian
cells described in Kozak, M., J. Cell Biol. 108: 229-241 (1989). The nucleotide sequence
for a cytosolic protein with β-lactamase activity in SEQ. ID. NO.: 3 contains a Kozak
sequence at nucleotides -9 to 4 (GGTACCACCATGA).
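
To make the "-9 to 4" numbering concrete, the following sketch maps each base of GGTACCACCATGA to its position relative to the start codon. It assumes the usual convention, not stated in the text, that the A of ATG is +1 and that there is no position 0. The purine check at -3 reflects the generally described Kozak context rather than anything specific to this document, and the function names are hypothetical.

```python
def annotate_start_context(seq: str, first_position: int = -9) -> dict:
    """Map positions (relative to the A of ATG at +1, no position 0) to bases."""
    positions = {}
    pos = first_position
    for base in seq.upper():
        positions[pos] = base
        pos += 1
        if pos == 0:  # numbering skips zero: ..., -2, -1, +1, +2, ...
            pos = 1
    return positions


def has_purine_at_minus_3(context: dict) -> bool:
    """Check for A or G three bases upstream of the start codon."""
    return context.get(-3) in ("A", "G")


kozak = annotate_start_context("GGTACCACCATGA")
start_codon = kozak[1] + kozak[2] + kozak[3]
```

Under this numbering the thirteen bases span -9 through +4, the start codon occupies +1 to +3, and position -3 carries an A.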

It is also preferable, when using mammalian cells, to include an IRES ("internal
ribosome entry binding site") element in the β-lactamase expression construct. Typically,
an IRES element will improve the yield of expressing clones. One caveat of integration
vectors is that only one in three insertions into an intron will be in frame and produce a
functional reporter protein. This limitation can be reduced by cloning an IRES sequence
between the splice acceptor site and the reporter gene (e.g., a β-lactamase
polynucleotide). This eliminates reading frame restrictions and possible functional
inactivation of the reporter protein by fusion to an endogenous protein. IRES elements
include those from picornaviruses, picorna-related viruses, and hepatitis A and C.
Preferably, the IRES element is from a poliovirus. Specific IRES elements can be found,
for instance, in WO9611211 by Das and Coward published 4/16/96, EP 585983 by Zurr
published 3/7/96, WO9601324 by Berlioz published 1/18/96 and WO9424301 by Smith
published October 27, 1994, all of which are herein incorporated by reference.
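
The "one in three" caveat follows from codon arithmetic and can be illustrated with a toy model. The sketch below is a simplification under stated assumptions: the only thing that matters is how many coding nucleotides of the endogenous gene precede the splice junction, every length is treated as equally likely, and an IRES is modeled simply as removing the frame requirement. All names are hypothetical.

```python
from fractions import Fraction


def reporter_in_frame(upstream_coding_nt: int) -> bool:
    """A spliced fusion keeps the reporter in frame only after whole codons."""
    return upstream_coding_nt % 3 == 0


def reporter_translated(upstream_coding_nt: int, has_ires: bool = False) -> bool:
    """With an IRES the reporter is translated independently of the upstream frame."""
    return has_ires or reporter_in_frame(upstream_coding_nt)


def translated_fraction(upstream_lengths, has_ires: bool = False) -> Fraction:
    """Fraction of insertions that yield a translated reporter in this toy model."""
    lengths = list(upstream_lengths)
    hits = sum(reporter_translated(n, has_ires) for n in lengths)
    return Fraction(hits, len(lengths))


without_ires = translated_fraction(range(1, 301))
with_ires = translated_fraction(range(1, 301), has_ires=True)
```

In this model a third of the insertions give an in-frame fusion without an IRES, while all of them are translated when an IRES precedes the reporter.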
To improve selection for integration of a β-lactamase polynucleotide into a genome, a
selectable marker can be used in the β-lactamase expression construct. Selectable markers
for mammalian cells are known in the art and include, for example, thymidine kinase,
dihydrofolate reductase (together with methotrexate as a DHFR amplifier), aminoglycoside
phosphotransferase, hygromycin B phosphotransferase, asparagine synthetase, adenosine
deaminase, metallothionein, and antibiotic resistance genes such as genes for neomycin
resistance. Selectable markers for non-mammalian cells are known in the art and include
genes providing resistance to antibiotics, such as kanamycin, tetracycline, and ampicillin.
The invention can be readily practiced with genomes having intron/exon
structures. Such genomes include those of mammals (e.g., human, rabbit, mouse, rat,
monkey, pig and cow), vertebrates, insects and yeast. Intron-targeted vectors are more
commonly used in mammalian cells because introns, or intervening sequences, are
considerably larger than exons, or mRNA coding regions, in mammals. Intron targeting
can be achieved by cloning a splice acceptor or 3' intronic sequences upstream of a
β-lactamase polynucleotide followed by a polyadenylation signal or 5' intronic splice
donor site. When the vector inserts into an intron, the reporter gene (e.g., β-lactamase) is
expressed under the same control as the gene into which it has inserted.

The invention can also be practiced with genomes having reduced numbers of, or
lacking, intron/exon structures. For lower eukaryotes, which have simple genomic
organization, i.e. containing few and small introns, exon-targeted vectors can be used.
Such vectors include β-lactamase polynucleotides operably linked to a poly-adenylation
sequence and optionally to an IRES element. Lower eukaryotes include yeast, fungi
and pathogenic eukaryotes (e.g. parasites and microorganisms). For genomes lacking
intron/exon structures, restriction enzyme integration, transposon induced integration or
selection integration can be used for genomic integration. Such methods include those
described by Kuspa and Loomis, PNAS 89: 8803-8807 (1992) and Derbyshire, K.M.,
Gene Nov. 7: 143-144 (1995). Prokaryotes can be used with the invention if integration
can occur in such genomes. Retroviral vectors can also be used to integrate β-lactamase
polynucleotides into a genome (e.g., eukaryotic), such as those methods and compositions
described in U.S. Patent Number 5,364,783.

Typically, integration will occur in the regions of the genome that are accessible
to the integration vector. Such regions are usually active portions of the genome where
there is increased genome regulatory activity, e.g. increased polymerase activity or a
change in DNA binding by proteins that regulate transcription of the genome. Many
embodiments of the invention described herein can result in random integration,
especially in actively transcribed regions.

Integration into Active Portions of the Genome

Integration, however, can be directed to regions of the genome active during
specific types of genome activity. For instance, integration at sites in the genome that are
active during specific phases of the cell cycle can be promoted by synchronizing the cells
in a desired phase of the cell cycle. Such cell cycle methods include those known in the
art, such as serum deprivation or alpha factors (for yeast). Integration may also be
directed to regions of the genome active during cell regulation by a chemical, such as an
antagonist or agonist for a receptor or some other chemical that increases, decreases or
otherwise modulates genome activity. By adding the chemical of interest, genome
activity can be increased, often in specific regions, to promote integration of an integration
vector (e.g. as a reporter gene construct), including those of the invention, into such
regions of the genome.

For instance, a nuclear receptor activator (general or specific) could be applied to
activate the cells prior to or during integration in order to promote integration of reporter
genes at sites in the genome that become more active during nuclear receptor activation.
Such cells could then be screened with the same or a different nuclear receptor activator to
identify which clones, and which portions of the genome, are active during nuclear
receptor activation. Any agonists, antagonists and modulators of the receptors described
herein can be used in such a manner, as well as any other chemicals that increase or
decrease genome activity.

Cells for Integration into the Genome

The cells used in the invention will typically correspond to the genome of interest.
For example, if regions of the human genome are desired to be identified, then human
cells containing a proper genetic complement will generally be used. Libraries, however,
could be biased by using cells that contain extra copies of certain chromosomes or other
portions of the genome. Cells that do not correspond to the genome of interest can also
be used if the genome of interest or significant portions of the genome of interest can be
replicated in the cells, such as by making a human-mouse hybrid.

Additionally, by the appropriate choice of cells and expressed proteins,
identification and screening assays can be constructed that detect active portions of the
genome associated with a biological process that requires, in whole or part, the presence
of a particular protein (protein of interest). Cells can be selected depending on the type of
proteins that are expressed (homologously or heterologously) or on the type of tissue
from which the cell line or explant was originally generated. If the identification of
portions of the genome activated by a particular type of protein is desired, then the cell
used should express that protein.

The cells can express the protein homologously, i.e. expression of the desired
protein normally or naturally occurs in the cells. Alternatively, the cells can be directed
to express a protein heterologously, i.e. expression of the desired protein does not
normally or naturally occur in the cells. Such heterologous expression can be directed by
"turning on" the gene in the cell encoding the desired protein or by transfecting the cell
with a polynucleotide encoding the desired protein (either by constitutive expression or
inducible expression). Inducible expression is preferred if it is thought that the expressed
protein of interest may be toxic to the cells.

Many cells can be used with the invention. Such cells include, but are not
limited to, adult, fetal, or embryonic cells. These cells can be derived from the
mesoderm, ectoderm, or endoderm and can be stem cells, such as embryonic or adult
stem cells, or adult precursor cells. The cells can be of any lineage, such as vascular,
neural, cardiac, fibroblasts, lymphocytes, hepatocytes, hematopoietic,
pancreatic, epidermal, myoblasts, or myocytes. Other cells include baby hamster kidney
(BHK) cells (ATCC No. CCL10), mouse L cells (ATCC No. CCL1.3), Jurkats (ATCC
No. TIB 152) and 153 DG44 cells (see, Chasin (1986) Cell. Molec. Genet. 12: 555),
human embryonic kidney (HEK) cells (ATCC No. CRL1573), Chinese hamster ovary
(CHO) cells (ATCC Nos. CRL9618, CCL61, CRL9096), PC12 cells (ATCC No.
CRL1721) and COS-7 cells (ATCC No. CRL1651). Preferred cells include Jurkat cells,
CHO cells, neuroblastoma cells, P19 cells, F11 cells, NT-2 cells, and HEK 293 cells,
such as those described in U.S. Patent No. 5,024,939 and by Stillman et al., Mol. Cell.
Biol. 5: 2051-2060 (1985). Preferred cells for heterologous protein expression are those
that can be readily and efficiently transfected.

Cells used in the present invention can be from continuous cell lines or primary
cell lines obtained from, for example, mammalian tissues, organs, or fluids. Tissue
sections as well as dispersed cells can be used in the present invention. Cells can also be
obtained from transgenic animals that have been engineered to express a reporter gene.
Cells obtained from transgenic or non-transgenic animals are preferred for cells that are
difficult to culture in vitro, such as neural and hepatic cells. Primary cell lines can be
made continuous using known methods, such as fusing primary cells with a continuous
cell line or expressing transforming proteins. Cells of the invention can be stored or used
with methods of the invention as isolated, clonal populations in plates such as those
described in commonly owned United States Patent Applications having Attorney Docket
Nos: 08366/010001, entitled "Low background multi-well plates and platforms for
spectroscopic measurements" (Coassin et al., filed June 2, 1997); and 08366/009001,
entitled "Low background multi-well plates with greater than 864 wells for spectroscopic
measurements" (Coassin et al., filed June 2, 1997); each of which is incorporated herein
by reference. Preferably, cells are stored or used in plates with 96, 384, 1536 or 3456
wells per plate. A single cell or a plurality of cells can be placed in such wells. Panels of
such isolated clonal populations will typically contain 1,000, 10,000, or 100,000 or more
populations, representing substantially equivalent numbers of independent
integration sites. Such panels can be used in profiling, pathway identification,
modulator identification, modulator characterization, and other methods of the invention.
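
As a rough planning aid, the plate formats and panel sizes named above can be combined to estimate how many plates a panel would occupy. The sketch assumes one clonal population per well and no control or empty wells, neither of which is stated in the text. The function name is hypothetical.

```python
import math

WELL_FORMATS = (96, 384, 1536, 3456)  # plate formats named in the text

def plates_needed(n_clones: int, wells_per_plate: int) -> int:
    """Plates required, assuming one clonal population per well and no controls."""
    return math.ceil(n_clones / wells_per_plate)

for n_clones in (1_000, 10_000, 100_000):
    print(n_clones, {w: plates_needed(n_clones, w) for w in WELL_FORMATS})
```

Under these assumptions a panel of 100,000 populations would need over a thousand 96-well plates but only a few dozen 3456-well plates.
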

Prior to being transfected with a trapping vector of the present invention, cells can
be transfected with an exogenous gene capable of expressing an exogenous protein, such
as a receptor (e.g., GPCR) or gene associated with the pathology of an etiological agent,
such as a virus, bacterium, or parasite. Cells that express such exogenous proteins can then
be transfected with a trapping vector to form a library of clones that can be screened
using the present invention. The invention can also include animals with β-lactamase
expression constructs integrated into the genome of interest.

Many of the cells of the present invention can report modulation of biological
processes by a variety of additional reporter genes or chemicals or combinations thereof.
For example, beta-lactamase, an enzyme, can convert non-chromogenic substrates to
chromogenic products or alter the chromogenic or fluorescent properties of a substrate
such as CCF2. Furthermore, fluorescent reporters, such as fluorescent proteins, including
green fluorescent protein (GFP) molecules, can be used as reporters. Some mutant GFP
molecules have different fluorescent properties as compared to wild-type GFP. These
GFPs can be used as reporters and can be used singly or in combination with the present
invention. For example, cells can have multiple reporters that can be differentiated to
report different biological processes, or different steps within a biological process, such
as steps in a signal transduction pathway.

Targets

Proteins of interest that can be expressed in the cells of the invention include:
hormone receptors (e.g. mineralocorticosteroid, glucocorticoid, and thyroid hormone
receptors); intracellular receptors (e.g., orphans, retinoids, vitamin D3 and vitamin A
receptors); signaling molecules (e.g., kinases, transcription factors, or molecules such as
signal transducers and activators of transcription) (Science Vol. 264, 1994, p. 1415-1421;
Mol. Cell. Biol., Vol. 16, 1996, p. 369-375); receptors of the cytokine superfamily (e.g.
erythropoietin, growth hormone, interferons, and interleukins (other than IL-8) and
colony-stimulating factors); G-protein coupled receptors, see US patent 5,436,128 (e.g.,
for hormones, calcitonin, epinephrine, gastrin, and paracrine or autocrine mediators, such
as somatostatin or prostaglandins) and neurotransmitter receptors (norepinephrine,
dopamine, serotonin or acetylcholine); tyrosine kinase receptors (such as insulin growth
factor, nerve growth factor (US patent 5,436,128)). Examples of the use of such proteins
are further described herein.

Any target, such as an intracellular or extracellular receptor involved in a signal
transduction pathway, such as the leptin or GPCR pathways, can be used with the present
invention. Furthermore, the genes activated or repressed by a target can be isolated and
identified, and modulators of those genes identified, using the present invention. For
example, the present invention can identify a G-protein coupled receptor (GPCR)
pathway, determine its function, isolate the genes modulated by the GPCR, and identify
modulators of such GPCR modulated proteins.

As an introduction to GPCR cell biology, the activation of Gα15 or Gα16 can,
through a G-protein signaling pathway, activate PLCβ, which in turn increases
intracellular calcium levels. An increase in calcium levels can lead to modulation of a
"calcium-responsive" promoter that is part of a signal transduction detection system, i.e.,
a promoter that is activated (e.g., an NFAT promoter, AP-1) or inhibited by a change in
calcium levels. One example of an NFAT DNA binding site is described in Shaw et al.,
Science 241:202-205 (1988). Likewise, a promoter that is responsive to changes in
protein kinase C levels (e.g., a "protein kinase C-responsive promoter") can be modulated
by an active Gα protein through a G-protein signaling pathway. Selected cells described
herein can also include a G-protein coupled receptor. Genes encoding numerous GPCRs
have been cloned (Simon et al., Science 252:802-808 (1991)), and conventional
molecular biology techniques can be used to express a GPCR on the surface of a cell of
the invention. Preferably, the responsive promoter can allow for only a relatively
short lag (e.g., less than 90 minutes) between engagement of the GPCR and
transcriptional activation. A preferred responsive promoter includes the nuclear factor of
activated T-cell promoter (Flanagan et al., Nature 352:803-807 (1991)). Polynucleotides
identified by methods of the invention can be used as response elements that are sensitive
to intracellular signals (signal-response elements). Signal response elements can be used
in the assays described herein, such as identification of useful chemicals. Such signal
response elements may be sensitive to intracellular signals that include voltage, pH, and
intracellular levels of Ca++, ATP, ADP, cAMP, GTP, GDP, K+, Na+, Zn++, oxygen,
metabolites and IP3.

In one aspect of the present invention, cells can be transformed to express an
exogenous receptor, such as a GPCR. Such a transduced cell line can then be further
transduced with a trapping vector to make a library of clones that can be used to identify
cells that report modulation of the exogenous receptor. Preferably, the host cell line
would not appreciably express the exogenous receptor.

Based on the unique structure of GPCRs, which have seven hydrophobic,
presumably trans-membrane, domains (see, Watson and Arkinstall, The G-Protein Linked
Receptor Facts Book, Academic Press, New York (1994)), orphan GPCRs (GPCRs
having no known function) can be identified by searching sequence databases, such as
those provided by the National Library of Medicine (Bethesda, MD), for similar motifs
and homologies. This same strategy can, of course, be used for any target, especially
when a paradigm sequence or motif has been determined.

Drug Discovery for Viruses and Other Pathogens

The function of genes from viruses or other pathogens that affect the expression
of genes in cells, such as mammalian cells, can be determined using the present
invention. Furthermore, chemicals that modulate these genes can be identified using the
methods of the present invention. For example, many transforming viruses, after infecting
a cell, have the effect of up-regulating genes involved in cell proliferation, which allows
the virus-infected cells to produce additional viruses, which can infect additional cells.
These transforming viruses can act by stimulating a receptor from the target cell.
One example of the mechanism is the Friend Erythroleukemia virus. This virus uses the
erythropoietin receptor for entry into the cells. When the virus is bound to the receptor, a
pathway is activated that causes an over-proliferation of red blood cells. If the activation
of the erythropoietin receptor is inhibited, a decrease in the accumulation of red blood
cells would result, which can prevent or reduce the severity of the leukemia. The
development of an assay that reports the activation of mammalian target genes allows the
identification of modulators of other viral or pathogenic dependent pathways. These
modulators can be used as therapeutic agents.

A general procedure for establishing this assay uses the virus or an isolated viral
protein as the stimulus for modulating a pathway. First, a gene-trapping library is made
using a cell line that can be infected by the virus or activated by the viral protein. The
virus is added to these cells, and clones are isolated that respond specifically to the
viral infection by the expression of a reporter gene.

As an example, the GP-120 portion of HIV protein is known to have a mitogenic
effect on cells exposed to GP-120, which indicates that downstream signaling pathways
are being activated that can be associated with the cytotoxicity of the virus and allow its
proliferation. Cell clones that are induced by this activation can be isolated and
used to screen for modulators of this cytotoxic or proliferative effect. Other viral
proteins, such as NEF from HIV, can be used. Chemicals that inhibit this effect can have
useful therapeutic value to treat viral infection or toxicity.

This approach can be applied to any cellular pathogen that has an effect on
target cells, such as cytotoxicity, cell proliferation, inflammation or other responses.
Other etiological targets include other viruses, such as retroviruses, adenovirus,
papillomavirus, herpesviruses, cytomegalovirus, adeno-associated viruses, hepatitis
viruses, and any other virus. In addition to viruses, any other pathogen, such as parasites,
bacteria, and viroids, can be used in the present invention. Particular viral targets include,
but are not limited to, NEF, Hepatitis X protein, and other viral proteins, such as those
that can be encoded or carried by a virus. In addition, two or more viral components can
be added to identify coviral pathogenesis components. This is a particularly valuable tool
for identifying pathways modulated by two or more viruses concurrently, or over time as
in slow activating viral conditions. For example, cotransfection with HIV and CMV may
be used. Viral targets or components do not include oncogenes or proto-oncogenes found
in uninfected genomes, and gene products thereof.

Screening Test Chemicals Using Portions Of The Genome 

Cells comprising β-lactamase polynucleotides integrated in the genome can be
contacted with test chemicals or modulators of a biological process and screened for
activity. Usually, the test chemical being screened will have at least one defined target,
usually a protein. The test chemical is normally applied to the cells to achieve a final
predetermined concentration in the medium bathing the cells. Typically, screens are
conducted at concentrations of 100 µM or less, preferably 10 µM or less, and preferably 1
µM or less for confirmatory screens. As described more fully herein, cells can be
subjected to multiple rounds of screening and selection using the same chemical in each
round to ensure the identification of clones with the desired response to a chemical, or
with different chemicals to characterize which chemicals produce a response (either an
increase or decrease in β-lactamase activity) in the cells. Such methods can be applied to
any chemical that alters the function of any of the proteins mentioned herein or known in the
art.

Chemicals and physiological processes without a defined target, however, can
also be used and screened with the cells of the invention. For example, once a clone is
identified as containing an active genomic polynucleotide that is activated by a particular
cellular signal (including extracellular signals), for instance by a neurotransmitter, that
same clone can be screened with chemicals lacking a defined target to determine if
activation by the neurotransmitter is blocked or enhanced by the chemical. This is a
particularly useful method for finding therapeutic targets downstream of receptor
activation (in this case a neurotransmitter). Such methods can be applied to any chemical
that alters the function of any of the proteins mentioned herein or known in the art. This
type of "targetless" assay is particularly useful as a screening tool for the medical conditions
and pathways described herein.

The methods and compositions described herein offer a number of advantages
over the prior art. For instance, screening of mammalian based gene integration libraries
is limited by the use of existing reporter systems. Many enzymatic reporter genes, such
as secreted alkaline phosphatase and luciferase, cannot be used to assay single living
cells (including by FACS) because the assay requires cell lysis to determine reporter gene
activity. Alternatively, β-galactosidase can detect expression in single cells, but substrate
loading requires permeabilization of cells, which can cause deleterious effects on normal
cell functions. Additionally, the properties of fluorescent β-galactosidase substrates, such
as fluorescein di-β-D-galactopyranoside, and products make it very difficult to screen
large libraries for both expressing and non-expressing cells because the substrate and
product are not well retained and do not permit ratiometric analysis to determine the amount
of uncleaved substrate. Green fluorescent protein (GFP), a non-enzymatic reporter, could be
used to detect expression in single living cells but has limited sensitivity. The GFP
expression level would have to be at least 100,000 molecules per cell to be detectable in a
screening format, and small changes in, or low levels of, gene expression could not be
measured. Furthermore, GFP is relatively stable and would not be suitable for measuring
down-regulation of genes. Other advantages of the invention are described herein or
readily recognized by one skilled in the art upon reviewing this disclosure.

Methods for Rapidly Identifying Modulators of Genomic Polynucleotides 

The invention provides for a method of identifying proteins or chemicals that
directly or indirectly modulate a genomic polynucleotide. Generally, the method
comprises inserting a β-lactamase expression construct into a eukaryotic genome,
usually non-yeast, contained in at least one living cell, contacting the cell with a
predetermined concentration of a modulator, and detecting β-lactamase activity in the
cell. Preferably, cleavage of a membrane permeant β-lactamase substrate is measured
and the membrane permeant β-lactamase substrate is transformed in the cell into a
trapped substrate. Preferably, the β-lactamase expression construct comprises a
β-lactamase polynucleotide, a splice donor, a splice acceptor and an IRES element. The
method can also include determining the coding nucleic acid sequence of a
polynucleotide operably linked to the β-lactamase expression construct using techniques
known in the art, such as RACE.

Modulator Identification

Modulators described herein can be used in this system to test for an increase or
decrease in β-lactamase activity in successfully integrated clones. Such cells can
optionally include specific proteins of interest as discussed herein. For example, the cell
can include a protein or receptor that is known to bind the modulator (e.g., a nuclear
receptor or receptor having a transmembrane domain heterologously or homologously
expressed by the cell). A second modulator can be added either simultaneously or
sequentially to the cell or cells, and β-lactamase activity can be measured before, during
or after such additions. Cells can be separated on the basis of their response to the
modulator (e.g. responsive or non-responsive) and can be characterized with a number of
different modulators to create a profile of cell activation or inhibition.

β-lactamase activity will often be measured in relation to a reference sample,
often a control. For example, β-lactamase activity is measured in the presence of the
modulator and compared to the β-lactamase activity in the absence of the modulator or
possibly a second modulator. Alternatively, β-lactamase activity is measured from a cell
expressing a protein of interest and compared to that of a cell not expressing the protein
of interest (usually the same cell type). For instance, a modulator may be known to bind
to a receptor expressed by the cell, and the β-lactamase activity in the cell is increased in
the presence of the modulator compared to the β-lactamase activity detected from a
corresponding cell in the presence of the modulator, wherein the corresponding cell does
not express the receptor.
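
The comparison to a reference sample can be sketched as a simple ratio. The readings, the function names and the 2-fold threshold below are hypothetical placeholders chosen for illustration; the text does not specify how activity is quantified or what change counts as a response.

```python
def fold_change(signal_with_modulator: float, signal_reference: float) -> float:
    """Ratio of reporter signal with modulator to the reference sample."""
    if signal_reference <= 0:
        raise ValueError("reference signal must be positive")
    return signal_with_modulator / signal_reference

def receptor_dependent(fc_expressing: float, fc_non_expressing: float,
                       threshold: float = 2.0) -> bool:
    """True if the response is seen only in cells expressing the receptor.

    The 2-fold threshold is an arbitrary placeholder, not from the source.
    """
    return fc_expressing >= threshold and fc_non_expressing < threshold

# Hypothetical readings (arbitrary units), for illustration only.
fc_receptor = fold_change(8.0, 2.0)   # cell expressing the receptor
fc_control = fold_change(2.2, 2.0)    # corresponding cell lacking the receptor
print(fc_receptor, fc_control, receptor_dependent(fc_receptor, fc_control))
```
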

Pathway Identification and Modulators 

When a reporter gene of the invention integrates into the genome of a host cell 
such that the reporter gene is expressed under a variety of circumstances, these clones can 
be used for drug discovery and functional genomics. These clones report the modulation 
of the reporter gene in response to a variety of stimuli, such as hormones and other 
physiological signals. These stimuli can be involved in a variety of known or unknown 
pathways that are modulated by known or unknown modulators or targets. Thus, these 
clones can be used as a tool to discover chemicals that modulate a particular pathway or 
to determine a cellular pathway. 

These pathways are quite varied and fall into general classes, each of which has specific
species that can be modulated by known or unknown modulators, or agonists or
antagonists thereof. By way of example, Table 1 illustrates various pathways, species of
these classes, and known modulators of these species. The invention can be used to
identify regions of the genome that are modulated by such pathways or physiological
events.



TABLE 1
Pathways and modulators

Pathway/Physiological Event

Genus                        Species                      Known Modulator
---------------------------  ---------------------------  ----------------------
Nuclear receptors            Estrogen receptor            Estrogen
Cytokines                    IL-2 receptor                IL-2
GPCRs                        Vasopressin receptor         Vasopressin
Transcription factors        Fos or Jun                   NFAT
Kinase dependent             Protein kinase C             PMA
Phosphatase dependent        Calcineurin                  Cyclosporin A
Protease dependent           Metalloprotease              TIMPs
Chemokine                    CCR1                         RANTES
Ion channels                 Calcium channels             Many known blockers
Second messenger dependent   Cyclic AMP                   cAMP inhibitor protein
Cell differentiation         Hematopoietic development    EPO
Cell growth                  IL-2 receptor                IL-2
Cell cycle dependent         CDK                          P21
Apoptosis                    Fas                          P53

In one embodiment, the invention provides for a genomic assay system to identify
downstream transcriptional targets for signaling pathways. This method requires the
target of interest to activate gene expression upon addition of a chemical or expression of
the target protein. A cell line that is the most similar to the tissue type where the target
functions is preferred for generating a library of clones with different integration sites
with β-lactamase polynucleotides or other reporter genes. This cell line may be known to
elicit a cellular response, such as differentiation, upon addition of a particular modulator.
If this type of cell line is available, it is preferred for screening, as it represents the native
context of the target. If a cell line is not available that homologously expresses the target,
a cell line can be generated by heterologously expressing the target in the most relevant
cell line. For instance, if the target is normally expressed in lymphoid cells, then a
lymphoid cell line would be used to generate the library.

The library of clones, as described further herein, can be separated into two pools
by FACS using the FRET system described herein: an expressing pool (e.g. blue cells)
and a non-expressing pool (e.g. green cells). These two pools can then be treated with a
modulator followed by FACS to isolate induced clones (e.g. green to blue) or repressed
clones (e.g. blue to green). Additional rounds of stimulation followed by FACS can be
performed to verify initial results. The specificity of activation can be tested by adding
additional chemicals that would not activate the defined target. This would allow the
identification of clones that have β-lactamase polynucleotides integrated into genes
activated by a variety of cellular signals.
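
The two-pool sorting logic can be sketched as a classification of each clone by its gate before and after treatment. The class, function and clone names are hypothetical, and real FACS gating on a ratiometric signal is reduced here to a two-state call for illustration.

```python
from enum import Enum

class Color(Enum):
    BLUE = "expressing"       # β-lactamase present (blue cells in the text)
    GREEN = "non-expressing"  # no β-lactamase (green cells in the text)

def classify(before: Color, after: Color) -> str:
    """Label a clone from its sort gate before and after modulator treatment."""
    if before is Color.GREEN and after is Color.BLUE:
        return "induced"
    if before is Color.BLUE and after is Color.GREEN:
        return "repressed"
    return "unchanged"

# Hypothetical clone identifiers and gate calls, for illustration only.
clones = {
    "clone_A": (Color.GREEN, Color.BLUE),
    "clone_B": (Color.BLUE, Color.GREEN),
    "clone_C": (Color.BLUE, Color.BLUE),
}
for name, (before, after) in clones.items():
    print(name, classify(before, after))
```

Repeating the same classification after a second round of stimulation corresponds to the verification step described above.
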

Once a pool of cells with the desired characteristics is isolated, the cells can be
expanded and their corresponding genes cloned and characterized. Targets which could
be used in this assay system include receptors, kinases, protein/protein interactions or
transcription factors and other proteins of interest discussed herein.

In another embodiment, the invention provides for a method of identifying
developmentally or tissue specific expressed genes. A β-lactamase polynucleotide can be
inserted, usually randomly, into any precursor cell such as an embryonic or hematopoietic
stem cell to create a library of clones. Constitutively expressing clones can be collected
by sorting for blue cells and non-expressing cells collected by sorting for green cells
using the FRET system described herein. The library of clones can then be stimulated or
allowed to differentiate, and induced or repressed clones isolated. Cell surface markers in
conjunction with fluorescent tagged antibodies or other detector molecules could be used
to monitor the expression of reference genes simultaneously. Additionally, by
stimulating and sorting stem cells at various developmental stages, it is possible to rapidly
identify genes responsible for maturation and differentiation of particular tissues.

Additionally, clones that have a β-lactamase polynucleotide integrated, either
randomly or by homologous recombination, into developmentally expressed genes can be
used with FACS to isolate specific cell populations for further study, such as screening.
Such methods can be used for identifying cell populations that have stem cell properties,
as well as providing an intracellular reporter that allows isolation and screening of such a
population of cells.

The present invention can yield screening cell lines for a variety of targets whose
downstream signaling elements are already known or postulated. These screening cell
lines can be used either to screen for modulators of transfected targets or as readouts for
expression cloning or functional analysis of uncharacterized targets. Screening cell lines
can be made for any pathway or any modulator, such as those described in Table 1.

In the case of ion channels, cell lines are generated in which β-lactamase
expression is used to detect a voltage change. This is possible because intracellular
signaling is sensitive to membrane potential and will modulate the expression of a subset
of genes. In one example, a library of neuronal cells prepared following the general
methods set forth in Examples 1 to 13, such as dorsal root neuroblastoma cells, can be
screened for a response to depolarization by incubating cells in high potassium (high
K+) medium. Depending on the particular characteristics of the cell library and the
method used, clones with a transcriptional response to a depolarizing treatment are
identified by sorting for cells which changed from either green to blue or blue to green
after depolarization. These clones are designated as voltage-sensitive clones and can be
used as screening cell lines to identify chemicals that modulate ion channels (either
endogenously expressed or transfected) which cause a voltage change upon either
activation or inhibition (e.g. K+ or Na+ channels). These cells are also useful for
expression cloning of ion channels. For example, a voltage-sensitive clone could be
transfected with a cDNA library. Those cells transfected with functional channels that
shift the membrane potential are detected via beta-lactamase and the cDNA gene products
are analyzed for activity as ion channels.

Furthermore, a gene encoding a known ion channel can be transfected into the
voltage-sensitive cell line and then used as a screen for channel modulators. For example,
expression or pharmacological activation of a Na+ channel can cause a depolarization that
can be reported by the cell line. This cell line can be used to screen for ion channel
modulators, either agonists or antagonists, depending on the experimental protocol. In a
variation of this approach, a genomic library from a cell line lacking K+ channels, such as
L929 cells, can be directly transfected with a K+ channel gene. The expression of the K+
channel causes a voltage shift, such as a hyperpolarization, causing a change in
expression of certain voltage-sensitive genes. The clones expressing these genes can be
used to screen for regulators of the ion channel.

Orphan protein signaling pathway identification and orphan protein modulators 

In another embodiment, the invention provides for a method of identifying
modulators of orphan proteins or genomic polynucleotides that are directly or indirectly
modulated by an orphan protein. Human disease genes are often identified and found to
show little or no sequence homology to functionally characterized genes. Such genes are
often of unknown function and thus encode an "orphan protein." Usually such orphan
proteins share less than 25% amino acid sequence homology with other known proteins
or are not considered part of a gene family. With such molecules there is usually no
therapeutic starting point. By using libraries of the herein described clones, one can
extract functional information about these novel genes.

Orphan proteins can be expressed, preferably overexpressed, in living mammalian
cells. By inducing overexpression of the orphan gene and monitoring the effect on
specific clones, one may identify genes that are transcriptionally regulated by the orphan
protein. By identifying genes whose expression is influenced by the novel disease gene
or other orphan protein, one may predict the physiological basis of the disease or function
of the orphan molecule. Insights gained using this method can lead to identification of a
valid therapeutic target for disease intervention.

Modulator Identification using Genomic Polynucleotides Activated by Cellular Signals

In another embodiment, the invention provides for a method of screening a
defined target or modulator using genomic polynucleotides identified with the methods
described herein. The gene identification methods described herein can also be used in
conjunction with a screening system for any target that functions (either naturally or
artificially) through transcriptional regulation.

In many instances a receptor and its ligand are known but not the downstream
biological processes required for signaling. For example, a cytokine receptor and
cytokine may be known but the downstream signaling mechanism is not. A library of
clones generated from a cell line that expresses the cytokine receptor can be screened to
identify clones showing changes in gene expression when stimulated by the cytokine.
The induced genes could be characterized to describe the signaling pathway. Using the
methods of the invention, gene characterization is not required for screen development, as
identification of a cell clone that specifically responds to the cytokine constitutes a usable
secondary screen. Therefore, clones that show activation or deactivation upon the
addition of the cytokine can be expanded and used to screen for agonists or antagonists of
the cytokine receptor. The advantage of this type of screening is that it does not require an
initial understanding of the signaling pathway and is therefore uniquely capable of
identifying leads for novel pathways.

In another embodiment, the invention provides for a method of functionally
characterizing a target using a panel of clones having active genomic polynucleotides as
identified herein. As large numbers of specifically responding cell lines containing active
genomic polynucleotides identified with a particular biological process or modulator are
generated, panels containing specific clones can be used for functional analysis of other
potential cellular modulators. These panels of responding cell lines can be used to
rapidly profile potential transcriptional regulators. In addition to clones with identified
active genomic polynucleotides generated by the invention, such panels can include
clones generated by more traditional methods. Clones can
be generated that contain both the identified active genomic polynucleotide with a
β-lactamase polynucleotide and specific response elements, such as SRE, CRE, NFAT,
TRE, IRE, or reporters under the control of specific promoters. These panels would
therefore allow the rapid analysis of potential effectors and their mechanisms of cellular
activation. A second reporter (e.g. a β-galactosidase gene) can also be used with this
method, as well as the other methods described herein.

In another embodiment, the invention provides for a method of test chemical
profiling using a clone or panel of clones having identified active polynucleotides. Test
chemical characterization is similar to target characterization except that the cellular
target(s) do not have to be known. This method will therefore allow the analysis of test
chemical (e.g. lead drug) effects on cellular function by defining genes affected by the
drug or drug lead.

Such a method can find application in the area of drug discovery and secondary
effects (e.g. cytotoxic effects) of drugs. The potential drug would be added to a library of
genomic clones, and clones which were either induced or repressed would be isolated or
identified. This method is analogous to target characterization except that the secondary
drug target is unknown. As well as providing a screen for the secondary effects, the assay
provides information on the mechanism of toxicity.

Methods Related to FACS and Identifying Active Genomic Polynucleotides

The invention provides for a method of identifying active genomic
polynucleotides using clones having integrated β-lactamase polynucleotides and FACS.
β-lactamase integration libraries can be used in a high-throughput screening format, such
as FACS, to detect transcriptional regulation. The compatibility of β-lactamase assays
with FACS enables a systematic method for defining patterns of transcriptional regulation
mediated by a range of factors. This approach has not been feasible or practical using
existing reporter systems. This new method will allow rapid identification of genes
responding to a variety of signals, including those governing tissue specific expression
and pattern formation.

For example, after integration of a β-lactamase polynucleotide, expressing and
non-expressing cells can be separated by FACS. These two cell populations can be
treated with potential modulators and changes in gene expression can be monitored using
a ratiometric fluorescent readout. Pools of clones will be isolated that show either up- or
down-regulation of reporter gene expression. Target genes from responding clones can
then be identified. In addition, by being able to separate expressing and non-expressing
cells at different time points after modulator addition, genes which are differentially
regulated over time can be identified. This approach therefore enables the elucidation of
transcription cascades mediated by cellular signaling. Specifically, it will provide a
means to identify downstream genes which are transcriptionally regulated by a variety of
molecules including nuclear receptors, cytokine receptors or transcription factors.
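
As a minimal sketch of the pooling logic described above, the Python fragment below assigns a clone to an induced, repressed, or unchanged pool from its blue/green emission ratio measured before and after modulator addition. The function name, the choice of a blue/green ratio as the readout, and the two-fold threshold are illustrative assumptions and are not parameters specified herein.

```python
def classify_clone(ratio_before, ratio_after, fold_threshold=2.0):
    """Classify a clone from its blue/green emission ratio before and after treatment.

    Assumption: a higher blue/green ratio means more cleaved substrate,
    i.e. more reporter expression. The threshold is hypothetical.
    """
    if ratio_before <= 0 or ratio_after <= 0:
        raise ValueError("emission ratios must be positive")
    if fold_threshold <= 1.0:
        raise ValueError("fold_threshold must be greater than 1")
    fold_change = ratio_after / ratio_before
    if fold_change >= fold_threshold:
        return "induced"      # up-regulation of the reporter
    if fold_change <= 1.0 / fold_threshold:
        return "repressed"    # down-regulation of the reporter
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical (before, after) ratios for three clones.
    clones = {"clone_a": (0.2, 1.0), "clone_b": (1.0, 0.2), "clone_c": (0.5, 0.6)}
    for name, (before, after) in clones.items():
        print(name, classify_clone(before, after))
```

Repeating the same classification at several time points after modulator addition would give, for each clone, a time series of labels from which differentially regulated genes could be picked out.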

Applications of this technology are nearly unlimited in the areas of gene discovery
and functional analysis. Libraries of cell lines from various tissue types could be
generated and used to identify genes with specific expression patterns or regulation
mechanisms. These libraries of clones would represent millions of integration sites
saturating the genome and can permit the identification of any expressed gene based on
its transcriptional regulation. The features of the β-lactamase reporter system, in part,
allow its use for this genomic integration assay in a high-throughput format.

There are a variety of other approaches that may be used with the invention,
including approaches similar to those proposed for β-lactamase. Examples would include
antibody epitopes presented on the cell surface with fluorescent antibodies to detect
positive cells. Gel matrixes could also be used which retain secreted reporters and allow
detection of positive cells. These approaches would, however, be limited in sensitivity
and would not be ratiometric in their detection. They would therefore allow for only the
sorting of positive cells based on fluorescent intensity.

Once active genomic polynucleotides have been identified, they can be sequenced
using various methods, including RACE (rapid amplification of cDNA ends). RACE is a
procedure for the identification of unknown mRNA sequences that flank known mRNA
sequences. Both 5' and 3' ends can be identified depending on the RACE conditions. 5'
RACE is done by first preparing RNA from a cell line or tissue of interest. This total or
polyA RNA is then used as a template for a reverse transcription reaction which can
either be random primed or primed with a gene-specific primer. A polynucleotide linker
of known sequence is then attached to the 3' end of the newly transcribed cDNA by
terminal transferase or RNA ligase. This cDNA is then used as the template for PCR
using one primer within the reporter gene and the other primer corresponding to the sequence
which had been linked to the 3' end of the first strand cDNA. The present invention is
particularly well suited for such techniques and does not require construction of
additional clones or constructs once the genomic polynucleotide has been identified.

Substrates for Measuring β-lactamase Activity

Any membrane permeant β-lactamase substrate capable of being measured
inside the cell after cleavage can be used in the methods and compositions of the
invention. Membrane permeant β-lactamase substrates will not require permeabilizing
eukaryotic cells either by hypotonic shock or by electroporation. Generally, such
non-specific pore forming methods are not desirable to use in eukaryotic cells because such
methods injure the cells, thereby decreasing viability and introducing additional variables
into the screening assay (such as loss of ionic and biological contents of the shocked or
porated cells). Such methods can be used in cells with cell walls or membranes that
significantly prevent or retard the diffusion of such substrates. Preferably, the membrane
permeant β-lactamase substrates are transformed in the cell into a β-lactamase substrate
of reduced membrane permeability (usually at least five-fold less permeable) or that is
membrane impermeant. Transformation inside the cell can occur via intracellular
enzymes (e.g. esterases) or intracellular metabolites or organic molecules (e.g. sulfhydryl
groups). Preferably, such substrates are fluorescent. Fluorescent substrates include those
capable of changes, either individually or in combination, of total fluorescence, excitation
or emission spectra or FRET.

Preferably, FRET type substrates are employed with the methods and
compositions of the invention. These include fluorogenic substrates of the general formula I:

D-S-A

wherein D is a FRET donor and A is a FRET acceptor and S is a substrate for a protein
with β-lactamase activity. β-lactamase activity cleaves either D-S or S-A bonds thereby
releasing either D or A, respectively, from S. Such cleavage resulting from β-lactamase
activity dramatically increases the distance between D and A which usually causes a
complete loss in energy transfer between D and A. Generally, molecules of D-S-A
structure are constructed to maximize the energy transfer between D and A. Preferably,
the distance between D and A is generally equal to or less than the

As would readily be appreciated by those skilled in the art, the efficiency of
fluorescence resonance energy transfer depends on the fluorescence quantum yield of the
donor fluorophore, the donor-acceptor distance and the overlap integral of donor
fluorescence emission and acceptor absorption. The energy transfer is most efficient
when a donor fluorophore with high fluorescence quantum yield (preferably, one
approaching 100%) is paired with an acceptor with a large extinction coefficient at
wavelengths coinciding with the emission of the donor. The dependence of fluorescence
energy transfer on the above parameters has been reported (Forster, T. (1948) Ann. Physik
2: 55-75; Lakowicz, J. R., Principles of Fluorescence Spectroscopy, New York: Plenum
Press (1983); Herman, B., Resonance energy transfer microscopy, in: Fluorescence
Microscopy of Living Cells in Culture, Part B, Methods in Cell Biology, Vol. 30, ed.
Taylor, D.L. & Wang, Y.L., San Diego: Academic Press (1989), pp. 219-243; Turro, N.
J., Modern Molecular Photochemistry, Menlo Park: Benjamin/Cummings Publishing Co.,
Inc. (1978), pp. 296-361), and tables of spectral overlap integrals are readily available to
those working in the field (for example, Berlman, I.B., Energy transfer parameters of
aromatic compounds, Academic Press, New York and London (1973)). The distance
between donor fluorophore and acceptor dye at which fluorescence resonance energy
transfer (FRET) occurs with 50% efficiency is termed R0 and can be calculated from the
spectral overlap integrals. For the donor-acceptor pair fluorescein - tetramethyl
rhodamine, which is frequently used for distance measurement in proteins, this distance R0
is around 50-70 Å (dos Remedios, C.G. et al. (1987) J. Muscle Research and Cell Motility
8:97-117). The distance at which the energy transfer in this pair exceeds 90% is about 45
Å. When attached to the cephalosporin backbone the distances between donors and
acceptors are in the range of 10 Å to 20 Å, depending on the linkers used and the size of
the chromophores. For a distance of 20 Å, a chromophore pair will have to have a
calculated R0 of larger than 30 Å for 90% of the donors to transfer their energy to the
acceptor, resulting in better than 90% quenching of the donor fluorescence. Cleavage of
such a cephalosporin by β-lactamase relieves quenching and produces an increase in donor
fluorescence efficiency in excess of tenfold. Accordingly, it is apparent that
identification of appropriate donor-acceptor pairs for use as taught herein in accordance
with the present invention would be essentially routine to one skilled in the art.

Reporter gene substrates described in Tsien et al., PCT Publication No.
WO96/30540 published October 3, 1996 are preferred for β-lactamase.
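
The distance dependence discussed above can be checked numerically. The following Python sketch uses the standard Forster relation E = 1 / (1 + (r/R0)^6); the function names are illustrative, and the fold-increase helper assumes that FRET is the only quenching pathway and that cleavage abolishes transfer completely. Both assumptions are simplifications introduced here for illustration.

```python
def fret_efficiency(distance, r0):
    """Forster transfer efficiency E = 1 / (1 + (r / R0)**6); same units for both."""
    if distance < 0 or r0 <= 0:
        raise ValueError("distance must be >= 0 and r0 must be > 0")
    return 1.0 / (1.0 + (distance / r0) ** 6)


def min_r0_for_efficiency(distance, efficiency):
    """Smallest R0 giving at least `efficiency` at the given donor-acceptor distance."""
    if distance < 0 or not 0.0 < efficiency < 1.0:
        raise ValueError("need distance >= 0 and 0 < efficiency < 1")
    return distance * (efficiency / (1.0 - efficiency)) ** (1.0 / 6.0)


def donor_fold_increase(efficiency):
    """Fold increase in donor emission when FRET quenching is fully relieved.

    Assumption: quenched donor emission is proportional to (1 - E).
    """
    if not 0.0 <= efficiency < 1.0:
        raise ValueError("need 0 <= efficiency < 1")
    return 1.0 / (1.0 - efficiency)


if __name__ == "__main__":
    e = fret_efficiency(20.0, 30.0)  # 20 angstrom separation, R0 = 30 angstrom
    print(f"E at r = 20, R0 = 30: {e:.3f}")
    print(f"minimum R0 for 90% transfer at r = 20: {min_r0_for_efficiency(20.0, 0.9):.1f}")
    print(f"donor fold increase on cleavage: {donor_fold_increase(e):.1f}")
```

Under these assumptions, a pair with R0 of 30 Å held 20 Å apart transfers slightly more than 90% of the donor energy, which is consistent with the greater than tenfold increase in donor fluorescence stated above.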

Fluorescence Measurements

When using fluorescent substrates, it will be recognized that different types of
fluorescent monitoring systems can be used to practice the invention. Preferably, FACS
systems are used or systems dedicated to high throughput screening, e.g., 96 well or
greater microtiter plates. Methods of performing assays on fluorescent materials are well
known in the art and are described in, e.g., Lakowicz, J. R., Principles of Fluorescence
Spectroscopy, New York: Plenum Press (1983); Herman, B., Resonance energy transfer
microscopy, in: Fluorescence Microscopy of Living Cells in Culture, Part B, Methods in
Cell Biology, vol. 30, ed. Taylor, D.L. & Wang, Y. L., San Diego: Academic Press
(1989), pp. 219-243; Turro, N. J., Modern Molecular Photochemistry, Menlo Park:
Benjamin/Cummings Publishing Co., Inc. (1978), pp. 296-361.

Fluorescence in a sample can be measured using a fluorimeter. In general,
excitation radiation, from an excitation source having a first wavelength, passes through
excitation optics. The excitation optics cause the excitation radiation to excite the
sample. In response, fluorescent proteins in the sample emit radiation that has a
wavelength that is different from the excitation wavelength. Collection optics then
collect the emission from the sample. The device can include a temperature controller to
maintain the sample at a specific temperature while it is being scanned. According to one
embodiment, a multi-axis translation stage moves a microtiter plate holding a plurality of
samples in order to position different wells to be exposed. The multi-axis translation
stage, temperature controller, auto-focusing feature, and electronics associated with
imaging and data collection can be managed by an appropriately programmed digital
computer. The computer also can transform the data collected during the assay into
another format for presentation.

Preferably, FRET is used as a way of monitoring β-lactamase activity inside a
cell. The degree of FRET can be determined by any spectral or fluorescence lifetime
characteristic of the excited construct, for example, by determining the intensity of the
fluorescent signal from the donor, the intensity of fluorescent signal from the acceptor,
the ratio of the fluorescence amplitudes near the acceptor's emission maximum to the
fluorescence amplitudes near the donor's emission maximum, or the excited state lifetime
of the donor. For example, cleavage of the linker increases the intensity of fluorescence
from the donor, decreases the intensity of fluorescence from the acceptor, decreases the
ratio of fluorescence amplitudes from the acceptor to that from the donor, and increases
the excited state lifetime of the donor.

Preferably, changes in the degree of FRET are determined as a function of the
change in the ratio of the amount of fluorescence from the donor and acceptor moieties, a
process referred to as "ratioing." Changes in the absolute amount of substrate, excitation
intensity, and turbidity or other background absorbances in the sample at the excitation
wavelength affect the intensities of fluorescence from both the donor and acceptor
approximately in parallel. Therefore the ratio of the two emission intensities is a more
robust and preferred measure of cleavage than either intensity alone.
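
The robustness of ratioing can be illustrated with a short Python sketch. The function names and intensity values below are hypothetical; the sketch assumes the idealized case in which substrate loading or excitation intensity scales the donor and acceptor signals by exactly the same factor.

```python
def emission_ratio(donor_intensity, acceptor_intensity):
    """Donor/acceptor emission ratio; it rises as substrate is cleaved."""
    if donor_intensity < 0 or acceptor_intensity <= 0:
        raise ValueError("donor must be >= 0 and acceptor must be > 0")
    return donor_intensity / acceptor_intensity


def scale_signals(donor_intensity, acceptor_intensity, factor):
    """Model a change in substrate loading or excitation intensity.

    Assumption: both channels scale by the same factor (idealized case).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return donor_intensity * factor, acceptor_intensity * factor


if __name__ == "__main__":
    donor, acceptor = 300.0, 100.0  # hypothetical intensities for one cell
    for loading in (0.5, 1.0, 2.0):
        d, a = scale_signals(donor, acceptor, loading)
        # Each intensity changes with loading, but the ratio does not.
        print(f"loading x{loading}: donor={d}, acceptor={a}, ratio={emission_ratio(d, a)}")
```

Either intensity alone would misreport a poorly loaded cell as a weak expresser, whereas the ratio is unchanged in this idealized model.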

The excited state lifetime of the donor moiety is, likewise, independent of the
absolute amount of substrate, excitation intensity, or turbidity or other background
absorbances. Its measurement requires equipment with nanosecond time resolution,
except in the special case of lanthanide complexes in which case microsecond to
millisecond resolution is sufficient.
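
For readers who prefer a worked relation, the textbook link between donor lifetime and transfer efficiency, E = 1 - tau_DA / tau_D, can be sketched in Python as follows. The relation is the standard one from the fluorescence literature rather than a formula given herein, and the function name and lifetime values are hypothetical.

```python
def efficiency_from_lifetimes(tau_donor_with_acceptor, tau_donor_alone):
    """FRET efficiency from donor lifetimes: E = 1 - tau_DA / tau_D.

    Both lifetimes must be in the same units (e.g. nanoseconds).
    """
    if tau_donor_alone <= 0 or tau_donor_with_acceptor < 0:
        raise ValueError("lifetimes must be non-negative and tau_donor_alone > 0")
    if tau_donor_with_acceptor > tau_donor_alone:
        raise ValueError("quenched lifetime cannot exceed the unquenched lifetime")
    return 1.0 - tau_donor_with_acceptor / tau_donor_alone


if __name__ == "__main__":
    # Hypothetical values: intact substrate 0.4 ns, cleaved product 4.0 ns.
    print(efficiency_from_lifetimes(0.4, 4.0))
```

Because only a ratio of lifetimes enters, the result does not depend on how much substrate a cell has taken up.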

The ratiometric fluorescent reporter system described herein has significant
advantages over existing reporters for gene integration analysis, as it allows sensitive
detection and isolation of both expressing and non-expressing single living cells. This
assay system uses a non-toxic, non-polar fluorescent substrate that is easily loaded and
then trapped intracellularly. Cleavage of the fluorescent substrate by β-lactamase yields a
fluorescent emission shift as substrate is converted to product. Because the β-lactamase
reporter readout is ratiometric, it is unique among reporter gene assays in that it controls
for variables such as the amount of substrate loaded into individual cells. The stable,
easily detected, intracellular readout eliminates the need for establishing clonal cell lines
prior to expression analysis. With the β-lactamase reporter system or other analogous
systems, flow sorting can be used to isolate both expressing and non-expressing cells from
pools of millions of viable cells. This positive and negative selection allows its use with
gene identification methods to isolate desired clones from large clone pools containing
millions of cells each containing a unique integration site.

High Throughput Screening System

The present invention can be used with systems and methods that utilize
automated and integratable workstations for identifying modulators, pathways, chemicals
having useful activity and other methods described herein. Such systems are described
generally in the art (see, U.S. Patent Nos: 4,000,976 to Kramer et al. (issued January 4,
1977), 5,104,621 to Pfost et al. (issued April 14, 1992), 5,125,748 to Bjornson et al.
(issued June 30, 1992), 5,139,744 to Kowalski (issued August 18, 1992), 5,206,568 to
Bjornson et al. (issued April 27, 1993), 5,350,564 to Mazza et al. (issued September 27, 1994),
5,589,351 to Harootunian (issued December 31, 1996), and PCT Application Nos: WO
93/20612 to Baxter Deutschland GMBH (published October 14, 1993), WO 96/05488 to
McNeil et al. (published February 22, 1996) and WO 93/13423 to Agong et al. (published
July 8, 1993)).

Typically, such a system includes: A) a storage and retrieval module comprising
storage locations for storing a plurality of chemicals in solution in addressable wells, a
well retriever, and having programmable selection and retrieval of the addressable wells
and having a storage capacity for at least 10,000 of the addressable wells, B) a sample
distribution module comprising a liquid handler to aspirate or dispense solutions from
selected addressable wells, the chemical distribution module having programmable
selection of, and aspiration from, the selected addressable wells and programmable
dispensation into selected addressable wells (including dispensation into arrays of
addressable wells with different densities of addressable wells per centimeter squared), C)
a sample transporter to transport the selected addressable wells to the sample distribution
module and optionally having programmable control of transport of the selected
addressable wells (including adaptive routing and parallel processing), D) a reaction
module comprising either a reagent dispenser to dispense reagents into the selected
addressable wells or a fluorescent detector to detect chemical reactions in the selected
addressable wells, and E) a data processing and integration module. The addressable wells
should be made of biocompatible materials that are also compatible with the assay to be
performed (see, U.S. Patent Application Attorney Docket No.: 08366/008001, "Systems
and methods for rapidly identifying useful chemicals in liquid samples" (Stylli et al., filed
May 16, 1997), which is incorporated herein by reference).

The storage and retrieval module, the sample distribution module, and the reaction
module are integrated and programmably controlled by the data processing and
integration module. The storage and retrieval module, the sample distribution module,
the sample transporter, the reaction module and the data processing and integration
module are operably linked to facilitate rapid processing of the addressable sample wells.
Typically, devices of the invention can process about 10,000 to 100,000 addressable
wells, which can represent about 5,000 to 100,000 chemicals, in a 24-hour period. Cell
clones generated using the present invention can be individually deposited into wells of a
multi-well platform having any number of wells, such as 96, 864, 3456, or more. The
cells in the wells can be cultured, stored, screened, and inventoried using such a system.
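
To put the stated throughput in perspective, the following Python sketch converts a daily well count into the number of plates of a given format and an hourly rate. The function names are illustrative, and the sketch assumes one sample per well and completely filled plates, which is a simplification.

```python
def plates_needed(wells, wells_per_plate):
    """Number of plates required to hold `wells` samples (ceiling division)."""
    if wells < 0 or wells_per_plate <= 0:
        raise ValueError("wells must be >= 0 and wells_per_plate must be > 0")
    return -(-wells // wells_per_plate)


def wells_per_hour(wells_per_day):
    """Average processing rate over a 24-hour period."""
    if wells_per_day < 0:
        raise ValueError("wells_per_day must be >= 0")
    return wells_per_day / 24.0


if __name__ == "__main__":
    for daily_wells in (10_000, 100_000):      # throughput range given in the text
        for plate_format in (96, 864, 3456):   # plate formats given in the text
            print(daily_wells, plate_format, plates_needed(daily_wells, plate_format))
        print(f"{daily_wells} wells/day is about {wells_per_hour(daily_wells):.0f} wells/hour")
```

Under these assumptions, the upper throughput figure corresponds to more than a thousand 96-well plates per day but fewer than thirty 3456-well plates, which illustrates why higher density formats matter for such a system.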

The present invention is also directed to chemical entities and information (e.g.,
modulators or chemicals or databases of biological activities of chemicals or targets)
generated or discovered by operation of the present invention, particularly chemicals and
information generated using such systems.

Pharmacology and Toxicity of Candidate Modulators 

The structure of a candidate modulator identified by the invention can be
determined or confirmed by methods known in the art, such as mass spectroscopy. For
putative modulators stored for extended periods of time, the structure, activity, and
potency of the putative modulator can be confirmed.

Depending on the system used to identify a candidate modulator, the candidate
modulator will have putative pharmacological activity. For example, if the candidate
modulator is found to inhibit T-cell proliferation (activation) in vitro, then the candidate
modulator would have presumptive pharmacological properties as an immunosuppressant
or anti-inflammatory (see, Suthanthiran et al., Am. J. Kidney Disease, 28:159-172
(1996)). Such nexuses are known in the art for several disease states, and more are
expected to be discovered over time. Based on such nexuses, appropriate confirmatory in
vitro and in vivo models of pharmacological activity, as well as toxicology, can be
selected. The methods described herein can also be used to assess pharmacological
selectivity and specificity, and toxicity.

Once identified, candidate modulators can be evaluated for toxicological effects
using known methods (see, Lu, Basic Toxicology, Fundamentals, Target Organs, and
Risk Assessment, Hemisphere Publishing Corp., Washington (1985); U.S. Patent No.
5,196,313 to Culbreth (issued March 23, 1993) and U.S. Patent No. 5,567,952 to Benet
(issued October 22, 1996)). For example, toxicology of a candidate modulator can be
established by determining in vitro toxicity towards a cell line, such as a mammalian, i.e.
human, cell line. Candidate modulators can be treated with, for example, tissue extracts,
such as preparations of liver, such as microsomal preparations, to determine increased or
decreased toxicological properties of the chemical after being metabolized by a whole
organism. The results of these types of studies are often predictive of toxicological
properties of chemicals in animals, such as mammals, including humans.

Alternatively, or in addition to these in vitro studies, the toxicological properties
of a candidate modulator in an animal model, such as mice, rats, rabbits, or monkeys, can
be determined using established methods (see, Lu, supra (1985); and Creasey, Drug
Disposition in Humans, The Basis of Clinical Pharmacology, Oxford University Press,
Oxford (1979)). Depending on the toxicity, target organ, tissue, locus, and presumptive
mechanism of the candidate modulator, the skilled artisan would not be burdened to
determine appropriate doses, LD50 values, routes of administration, and regimes that
would be appropriate to determine the toxicological properties of the candidate
modulator. In addition to animal models, human clinical trials can be performed
following established procedures, such as those set forth by the United States Food and
Drug Administration (USFDA) or equivalents of other governments. These toxicity
studies provide the basis for determining the efficacy of a candidate modulator in vivo.




Efficacy of Candidate Modulators

Efficacy of a candidate modulator can be established using several art recognized
methods, such as in vitro methods, animal models, or human clinical trials (see, Creasey,
supra (1979)). Recognized in vitro models exist for several diseases or conditions. For
example, the ability of a chemical to extend the life-span of HIV-infected cells in vitro is
recognized as an acceptable model to identify chemicals expected to be efficacious to
treat HIV infection or AIDS (see, Daluge et al., Antimicro. Agents Chemother. 41:1082-
1093 (1995)). Furthermore, the ability of cyclosporin A (CsA) to prevent proliferation of
T-cells in vitro has been established as an acceptable model to identify chemicals
expected to be efficacious as immunosuppressants (see, Suthanthiran et al., supra,
(1996)). For nearly every class of therapeutic, disease, or condition, an acceptable in
vitro or animal model is available. Such models exist, for example, for gastro-intestinal
disorders, cancers, cardiology, neurobiology, and immunology. In addition, these in vitro
methods can use tissue extracts, such as preparations of liver, such as microsomal
preparations, to provide a reliable indication of the effects of metabolism on the candidate
modulator. Similarly, acceptable animal models may be used to establish efficacy of
chemicals to treat various diseases or conditions. For example, the rabbit knee is an
accepted model for testing chemicals for efficacy in treating arthritis (see, Shaw and
Lacy, J. Bone Joint Surg. (Br) 55:197-205 (1973)). Hydrocortisone, which is approved
for use in humans to treat arthritis, is efficacious in this model, which confirms the
validity of this model (see, McDonough, Phys. Ther. 62:835-839 (1982)). When
choosing an appropriate model to determine efficacy of a candidate modulator, the skilled
artisan can be guided by the state of the art to choose an appropriate model, dose,
route of administration, regime, and endpoint and as such would not be unduly burdened.

In addition to animal models, human clinical trials can be used to determine the
efficacy of a candidate modulator in humans. The USFDA, or equivalent governmental
agencies, have established procedures for such studies.

Selectivity of Candidate Modulators

The in vitro and in vivo methods described above also establish the selectivity of a
candidate modulator. It is recognized that chemicals can modulate a wide variety of
biological processes or be selective. Panels of cells based on the present invention can be
used to determine the specificity of the candidate modulator. Selectivity is evident, for
example, in the field of chemotherapy, where the selectivity of a chemical to be toxic
towards cancerous cells, but not towards non-cancerous cells, is obviously desirable.
Selective modulators are preferable because they have fewer side effects in the clinical
setting. The selectivity of a candidate modulator can be established in vitro by testing the
toxicity and effect of a candidate modulator on a plurality of cell lines that exhibit a
variety of cellular pathways and sensitivities. The data obtained from these in vitro
toxicity studies can be extended to animal model studies, including human clinical trials, to
determine toxicity, efficacy, and selectivity of the candidate modulator.

The selectivity, specificity and toxicology, as well as the general pharmacology,
of a test chemical can often be improved by generating additional test chemicals based on
the structure/property relationships of the test chemical originally identified as having
activity (a "Hit"). Test chemicals identified as having activity can be modified to
improve various properties, such as affinity, life-time in the blood, toxicology, specificity
and membrane permeability. Such refined test chemicals can be subjected to additional
assays as described herein for activity analysis. Methods for generating and analyzing
such chemicals are known in the art, such as U.S. patent 5,574,656 to Agrafiotis et al.
Compositions 

The present invention also encompasses a modulator in a pharmaceutical
composition prepared for storage and subsequent administration, which has a
pharmaceutically effective amount of the candidate modulator in a
pharmaceutically acceptable carrier or diluent. Chemicals
identified by the methods described herein do not include chemicals publicly available as
of the filing date of the present application or in the prior art. Acceptable carriers or
diluents for therapeutic use are well known in the pharmaceutical art, and are described,
for example, in Remington's Pharmaceutical Sciences, Mack Publishing Co. (A.R.
Gennaro edit. 1985). Preservatives, stabilizers, dyes and even flavoring agents may be
provided in the pharmaceutical composition. For example, sodium benzoate, sorbic acid
and esters of p-hydroxybenzoic acid may be added as preservatives. In addition,
antioxidants and suspending agents may be used.
The compositions of the present invention may be formulated and used as tablets,
capsules or elixirs for oral administration; suppositories for rectal administration; sterile
solutions or suspensions for injectable administration; and the like. Injectables can be
prepared in conventional forms either as liquid solutions or suspensions, solid forms
suitable for solution or suspension in liquid prior to injection, or as emulsions. Suitable
excipients are, for example, water, saline, dextrose, mannitol, lactose, lecithin, albumin,
sodium glutamate, cysteine hydrochloride, and the like. In addition, if desired, the
injectable pharmaceutical compositions may contain minor amounts of nontoxic auxiliary
substances, such as wetting agents, pH buffering agents, and the like. If desired,
absorption enhancing preparations (e.g., liposomes) may be utilized.

The pharmaceutically effective amount of the candidate modulator required as a
dose will depend on the route of administration, the type of animal being treated, and the
physical characteristics of the specific animal under consideration. The dose can be
tailored to achieve a desired effect, but will depend on such factors as weight, diet,
concurrent medication and other factors which those skilled in the medical arts will
recognize. In practicing the methods of the invention, the pharmaceutical compositions
can be used alone or in combination with one another, or in combination with other
therapeutic or diagnostic agents. These products can be utilized in vivo, ordinarily in a
mammal, preferably in a human, or in vitro. In employing them in vivo, the
pharmaceutical composition can be administered to the mammal in a variety of ways,
including parenterally, intravenously, subcutaneously, intramuscularly, colonically,
rectally, nasally or intraperitoneally, employing a variety of dosage forms. Such methods
may also be applied to testing chemical activity in vivo.

As will be readily apparent to one skilled in the art, the useful in vivo dosage to be
administered and the particular mode of administration will vary depending upon the age,
weight and mammalian species treated, the particular pharmaceutical composition
employed, and the specific use for which the pharmaceutical composition is employed.
The determination of effective dosage levels, that is, the dosage levels necessary to
achieve the desired result, can be accomplished by one skilled in the art using routine
methods as discussed above. Typically, human clinical applications of products are
commenced at lower dosage levels, with dosage level being increased until the desired
effect is achieved. Alternatively, acceptable in vitro studies can be used to establish
useful doses and routes of administration of the compositions identified by the present
methods using established pharmacological methods.

In non-human animal studies, applications of potential pharmaceutical
compositions are commenced at higher dosage levels, with the dosage being decreased
until the desired effect is no longer achieved or adverse side effects are reduced or
disappear. The dosage for the products of the present invention can range broadly
depending upon the desired effects and the therapeutic indication. Typically, dosages
may be between about 10 ng/kg and 1/g/kg body weight, preferably between about 100
µg/kg and 10 mg/kg body weight. Administration is preferably oral on a daily basis.

The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition (see e.g., Fingl et al., in The
Pharmacological Basis of Therapeutics, 1975). It should be noted that the attending
physician would know how to and when to terminate, interrupt, or adjust administration
due to toxicity, organ dysfunction, or other adverse effects. Conversely, the attending
physician would also know to adjust treatment to higher levels if the clinical response
were not adequate (precluding toxicity). The magnitude of an administered dose in the
management of the disorder of interest will vary with the severity of the condition to be
treated and with the route of administration. The severity of the condition may, for
example, be evaluated, in part, by standard prognostic evaluation methods. Further, the
dose, and perhaps dose frequency, will also vary according to the age, body weight, and
response of the individual patient. A program comparable to that discussed above may be
used in veterinary medicine.
Depending on the specific conditions being treated, such pharmaceutical
compositions may be formulated and administered systemically or locally. Techniques
for formulation and administration may be found in Remington's Pharmaceutical
Sciences, 18th Ed., Mack Publishing Co., Easton, PA (1990). Suitable routes may
include oral, rectal, transdermal, vaginal, transmucosal, or intestinal administration;
parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as
well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or
intraocular injections.

For injection, the pharmaceutical compositions of the invention may be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks' solution, Ringer's solution, or physiological saline buffer. For such transmucosal
administration, penetrants appropriate to the barrier to be permeated are used in the
formulation. Such penetrants are generally known in the art. Use of pharmaceutically
acceptable carriers to formulate the pharmaceutical compositions herein disclosed for the
practice of the invention into dosages suitable for systemic administration is within the
scope of the invention. With proper choice of carrier and suitable manufacturing practice,
the compositions of the present invention, in particular, those formulated as solutions,
may be administered parenterally, such as by intravenous injection. The pharmaceutical
compositions can be formulated readily using pharmaceutically acceptable carriers well
known in the art into dosages suitable for oral administration. Such carriers enable the
chemicals of the invention to be formulated as tablets, pills, capsules, liquids, gels,
syrups, slurries, suspensions and the like, for oral ingestion by a patient to be treated.

Agents intended to be administered intracellularly may be administered using
techniques well known to those of ordinary skill in the art. For example, such agents may
be encapsulated into liposomes, then administered as described above. All molecules
present in an aqueous solution at the time of liposome formation are incorporated into the
aqueous interior. The liposomal contents are both protected from the external micro-
environment and, because liposomes fuse with cell membranes, are efficiently delivered
into the cell cytoplasm. Additionally, due to their hydrophobicity, small organic
molecules may be directly administered intracellularly.

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to
achieve their intended purpose. Determination of the effective amount of a pharmaceutical
composition is well within the capability of those skilled in the art, especially in light of
the detailed disclosure provided herein. In addition to the active ingredients, these
pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active chemicals
into preparations which can be used pharmaceutically. The preparations formulated for
oral administration may be in the form of tablets, dragees, capsules, or solutions. The
pharmaceutical compositions of the present invention may be manufactured in a manner
that is itself known, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing
processes. Pharmaceutical formulations for parenteral administration include aqueous 
solutions of the active chemicals in water-soluble form. Additionally, suspensions of the 
active chemicals may be prepared as appropriate oily injection suspensions. Suitable 
lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid 
esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions 
may contain substances which increase the viscosity of the suspension, such as sodium 
carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also 
contain suitable stabilizers or agents that increase the solubility of the chemicals to allow 
for the preparation of highly concentrated solutions. 

Pharmaceutical compositions for oral use can be obtained by combining the active
chemicals with solid excipient, optionally grinding a resulting mixture, and processing
the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or
dragee cores. Suitable excipients are, in particular, fillers such as sugars, including
lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize
starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium
alginate. Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic, talc,
polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dye-stuffs or pigments may
be added to the tablets or dragee coatings for identification or to characterize different
combinations of active chemical doses.
EXAMPLES

Example 1 β-lactamase Expression Constructs

To investigate various beta-lactamase expression constructs (BLECs),
multiple BLECs were constructed and transfected into mammalian cells.
The first of these, BLEC-1, was constructed by cloning the cytoplasmic
form of β-lactamase, SEQ. ID NO. 4 (see Table 1), such that it is functionally
linked to the En-2 splice acceptor sequence, as shown in FIG. 3. This vector, when
inserted into a genomic intron, will result in the generation of a fusion RNA
between an endogenous target gene and β-lactamase ("BL"). BLEC-1 also
contains a bovine growth hormone poly-adenylation sequence (BGH-polyA)
downstream of the cytoplasmic beta-lactamase.

BLEC-2 was constructed identically to BLEC-1, except that a polio virus
internal ribosomal entry site (IRES) sequence was inserted between the En-2
splice acceptor and β-lactamase ("BL"). This eliminates reading frame restrictions and
possible inactivation of beta-lactamase by fusion to an endogenous protein. To
allow for selection of stable transfectants for BLEC-1 and BLEC-2, a neomycin
or G418 resistance cassette was cloned downstream of the BGH poly-adenylation
sequence. This cassette consists of a promoter, neomycin resistance gene and an
SV40 poly-adenylation sequence, as shown in FIG. 3.

Two alternative constructs, BLEC-3 and BLEC-4, were constructed similarly
to BLEC-1 and BLEC-2, respectively, except the SV40 poly A was replaced
with a splice donor sequence. This should enrich for insertion into transcribed
regions, as it requires the presence of an endogenous splice acceptor and
polyadenylation sequence downstream of the vector insertion site to generate
G418 resistant clones. BLEC-3 and BLEC-4 also use the PGK promoter to drive
the neomycin resistance gene instead of the human beta-actin promoter.
The structure of CCF2-AM (BL substrate) used in the experiments below
is:

[Chemical structure of CCF2-AM; drawing not reproduced in this text]


Table 2

SEQ. ID NO. 1
    Parent BL gene and reference:  Escherichia coli RTEM, Kadonaga et al.
    Modification:                  Signal sequence replaced by: ATG AGT
    Mammalian expression vector:   pMAM-neo, glucocorticoid-inducible
    Location of expression:        cytoplasmic

SEQ. ID NO. 2
    Parent BL gene and reference:  Escherichia coli RTEM, Kadonaga et al.
    Modification:                  Wild type secreted enzyme; 2 changes in
                                   pre-sequence: ser 2 → arg, ala 23 → gly
    Mammalian expression vector:   pMAM-neo, glucocorticoid-inducible
    Location of expression:        secreted extracellularly

SEQ. ID NO. 3
    Parent BL gene and reference:  Escherichia coli RTEM
    Modification:                  β-globin upstream leader:
                                   AAGCTTTTTGCAGAAGCTCAGAATAAACGCAACTTTCCG
                                   Kozak sequence: GGTACCACCATGG
                                   Signal sequence replaced by: ATG GGG
    Mammalian expression vector:   pCDNA 3 (CMV promoter) and pZEO (SV40 promoter)
    Location of expression:        cytoplasmic

SEQ. ID NO. 4
    Parent BL gene and reference:  Escherichia coli RTEM
    Modification:                  Kozak sequence: GGTACCACCATGG
                                   Signal sequence replaced by: ATG GAC
                                   (GAC replaces CAT)
    Mammalian expression vector:   pCDNA3 (CMV promoter) and BLECs
    Location of expression:        cytoplasmic

SEQ. ID NO. 5
    Parent BL gene and reference:  Bacillus licheniformis 749/C, Neugebauer et al.
    Modification:                  Signal sequence removed, new N-terminal ATG
    Mammalian expression vector:   pCDNA 3 (CMV promoter)
    Location of expression:        cytoplasmic

Table 3
Functional Elements

Vector   Splice acceptor      Adapter         Reporter gene   Reporter gene  Selection         Resistance marker
                                                              poly A         promoter
BLEC-1   En2 splice acceptor  protein fusion  SEQ. ID NO. 4   BGH polyA      β-actin promoter  Neo-polyA
BLEC-2   En2 splice acceptor  IRES            SEQ. ID NO. 4   BGH polyA      β-actin promoter  Neo-polyA
BLEC-3   En2 splice acceptor  protein fusion  SEQ. ID NO. 4   BGH polyA      PGK promoter      Neo-splice donor
BLEC-4   En2 splice acceptor  IRES            SEQ. ID NO. 4   BGH polyA      PGK promoter      Neo-splice donor
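The vector layouts in Table 3 can also be written as a small lookup structure, which makes the differences between constructs easy to query. The following Python sketch is illustrative only: the field names and the `differences` helper are assumptions introduced here, and the values are transcribed from Table 3 as best it can be read.

```python
# Functional elements of each vector, transcribed from Table 3.
# Field names are hypothetical labels chosen for this sketch.
BLEC_VECTORS = {
    "BLEC-1": {
        "splice_acceptor": "En2",
        "adapter": "protein fusion",
        "reporter": "SEQ. ID NO. 4",
        "reporter_polyA": "BGH",
        "selection_promoter": "beta-actin",
        "marker_3prime": "polyA",
    },
    "BLEC-2": {
        "splice_acceptor": "En2",
        "adapter": "IRES",
        "reporter": "SEQ. ID NO. 4",
        "reporter_polyA": "BGH",
        "selection_promoter": "beta-actin",
        "marker_3prime": "polyA",
    },
    "BLEC-3": {
        "splice_acceptor": "En2",
        "adapter": "protein fusion",
        "reporter": "SEQ. ID NO. 4",
        "reporter_polyA": "BGH",
        "selection_promoter": "PGK",
        "marker_3prime": "splice donor",
    },
    "BLEC-4": {
        "splice_acceptor": "En2",
        "adapter": "IRES",
        "reporter": "SEQ. ID NO. 4",
        "reporter_polyA": "BGH",
        "selection_promoter": "PGK",
        "marker_3prime": "splice donor",
    },
}


def differences(a, b):
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    va, vb = BLEC_VECTORS[a], BLEC_VECTORS[b]
    return {field: (va[field], vb[field]) for field in va if va[field] != vb[field]}


if __name__ == "__main__":
    print(differences("BLEC-1", "BLEC-2"))  # only the adapter differs
    print(differences("BLEC-1", "BLEC-3"))  # promoter and 3' element differ
```

Queried this way, BLEC-1 and BLEC-2 differ only in the adapter, while BLEC-3 differs from BLEC-1 in the selection promoter and in the element that follows the neomycin resistance gene, matching the description in Example 1.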

Example 2 Libraries of BLEC Clones

To investigate the function of each of the BLEC vectors, they were
transfected by electroporation into RBL-1 cells and stable clones were selected for
each of the four BLEC plasmids (see Table 2). Selective media contained
DMEM, 10% fetal bovine serum (FBS) and 400 µg/ml Geneticin (G418). G418
resistant cell clones were pooled from multiple transfections to generate a library
of BLEC stable integrated clones.

This library of BLEC-1 integrated clones was loaded with the fluorescent
substrate of BL (CCF-2-AM) by adding 10 µM CCF-2-AM in HBSS containing
10 µM HEPES, pH 7.1, and 1% glucose. After a 1 hour incubation at 22°C, cells were
washed with HBSS and viewed upon excitation with 400 nm light using a 435 nm
long pass emission filter. Under these assay conditions 10% of the cells were blue
fluorescent, indicating they were expressing β-lactamase. This result suggests that
the BLEC-1 construct is functioning as a gene integration vector.

Stable cell lines were also generated by transfecting BLEC-1 into CHO-K1
and Jurkat cells. Populations of BLEC-1 integrated clones from CHO and
Jurkat cells showed similar results to those obtained with RBL-1 clones, with
10-15% of BLEC integrated cell clones expressing BL as determined by their
blue/green ratio after loading with CCF-2-AM. This result shows that the BLECs
function in a variety of cell types including human T-cells (Jurkat), rat basophilic
leukocytes (RBL), and Chinese hamster ovarian (CHO) cells.

Example 3 Isolating BLEC Clones Expressing β-lactamase

Fluorescence-activated cell sorting (FACS) of multi-clonal populations of RBL-1
gene integrated clones was used to identify clones with regulated BL gene
expression. A BL non-expressing population of cells was isolated by sorting a
library of BLEC-1 integrated clones generated by transfection of RBL-1 cells as
described in Example 2. 180,000 clones expressing little or no BL were isolated
by sorting for clones with a low blue/green ratio (R1 population), as shown in
FIG. 4A. This population of clones was grown for seven days and resorted by
FACS to test the population's fluorescent properties. FACS analysis of the cell
clones sorted from R1 shows that most of the cells with a high blue/green ratio,
~0.1%, have been removed by one round of sorting for green cells, as shown in
FIG. 4B. It is also clear that the total population has shifted towards more green
cells compared to the parent population, as shown in FIG. 4A. There are,
however, cells with a high blue/green ratio showing up in the green sorted
population. These may represent clones in which the BLEC has integrated into a
differentially regulated gene, such as a gene whose expression changes throughout
the cell cycle.

The population of RBL-1 clones shown in FIG. 4B was stimulated by
addition of 1 µM ionomycin for 6 hours and resorted to identify clones which had
the BLEC integrated into a gene which is inducible by increasing intracellular
calcium. Table 4 below summarizes the results from this experiment. A greater
percentage of blue clones was present in all three of the blue sub-populations (R4,
R2, R5) in the ionomycin stimulated population when compared to the unstimulated
population. This sorted population represents the following classes of blue cells:
R4 (highest blue/green ratio (bright blues)), R2 (multicolor blues), and R5 (lower
blue/green ratio (least blue)). Additionally, in the ionomycin stimulated
population there is a decrease in the percent green cells from the unstimulated
population (R6). This increase in blue clones in the ionomycin stimulated
population indicates that a sub-population of blue clones have the BLEC inserted
into a gene which is induced by ionomycin. Individual blue clones were sorted
from the ionomycin stimulated population and are analyzed for their expression
profile.

Table 4

Sort Window     Unstimulated %   1 µM Ionomycin   Ratio +Ion/-Ion
(See FIG. 4)                     Stimulated %
R4 (blue)       0.11             0.24             2.2
R2              2.39             3.5              1.5
R5              1.53             2.5              1.6
R6 (green)      66.23            61.64            0.9
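The last column of Table 4 is the stimulated percentage divided by the unstimulated percentage for each sort window. A minimal Python sketch of that arithmetic follows. The dictionary layout and function name are illustrative, and the unstimulated R5 value is garbled in the source text; 1.53 is an assumed reading, chosen because it is consistent with the reported ratio of 1.6.

```python
# Percent of cells in each sort window from Table 4: (unstimulated, stimulated).
# NOTE: the R5 unstimulated value (1.53) is an assumed reading of a garbled entry.
WINDOWS = {
    "R4": (0.11, 0.24),
    "R2": (2.39, 3.5),
    "R5": (1.53, 2.5),
    "R6": (66.23, 61.64),
}


def induction_ratio(unstimulated_pct, stimulated_pct):
    """Ratio of stimulated to unstimulated percentage (+Ion/-Ion)."""
    if unstimulated_pct <= 0:
        raise ValueError("unstimulated percentage must be positive")
    return stimulated_pct / unstimulated_pct


# Round to one decimal place, as in the table.
ratios = {w: round(induction_ratio(u, s), 1) for w, (u, s) in WINDOWS.items()}

if __name__ == "__main__":
    for window, ratio in ratios.items():
        print(window, ratio)
```

A ratio above 1 indicates that the window gained cells on ionomycin stimulation (the blue windows), and a ratio below 1 indicates a loss (the green R6 window).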



In addition to allowing the isolation of cell clones with inducible BL
expression from large populations of cells, clones can be isolated based on their level
of BL expression. To isolate cells with different levels of BL expression, blue
clones can be sorted after different exposure times to substrate or by their
blue/green ratio. Cells with a lower blue/green ratio or those requiring longer
incubation times will represent clones expressing lower levels of BL. This is
demonstrated by the FACS scan above, as clones sorted from the R4 window have
a higher blue/green ratio, indicating they are expressing higher levels of BL, while cells
sorted from the R5 window have a lower blue/green ratio (visually turquoise), indicating
lower BL expression. Cells sorted from the R3 window, which contains all the blue
cells, show variation in blue color from bright blue (high blue/green ratio) to
turquoise blue (low blue/green ratio).

To demonstrate that the expression constructs are relatively stable for
sorted clones, cells were sorted from R3 (blue population) as shown in FIG. 4A
and cultured in the absence of selective pressure for several weeks. There was
little change in the percent of blue cells in the cultured population, with the percent
blue being maintained at ~90%. This result represents a 10-fold enrichment for
clones constitutively expressing BL by one round of FACS selection.
Cells in the R6 window have the lowest blue/green ratio and appear green
visually. R6 cells are therefore not expressing BL or are expressing BL below the
detection limit of our assay.

Example 4 Stability of BLEC Clones 

To further investigate the stability of reporter gene integrations into
constitutively active genes, single blue clones were sorted from cell clone
populations generated by transfecting RBL-1 and CHO-K1 with BLEC-1. After
addition of CCF-2 to the multi-clonal cell population, single blue clones were
sorted into 96 well microtiter plates. These clones were expanded to 24 well
dishes, which took 7-10 days. The cell viability varied between the two cell types,
with 80% of the sorted clones forming colonies for the CHO and 36% for the
RBL-1 cells. After expansion into 24 well dishes, 20 CHO BLEC-1 stable
clones were tested for BL expression by addition of CCF-2-AM. 20/20 of these
clones expressed BL, with the percent blue cells within a clone ranging from 70%
to 99%. This result is consistent with the earlier data presented for RBL-1, in which
the blue sorted population was tested for BL expression after several weeks of
non-selective culturing. There were, however, significant differences between
clones in their blue/green ratio and hence their level of BL expression. This
suggested that genes with different levels of constitutive expression had been
tagged with the BLEC. Although there were significant differences in blue color
between separate clones, the blue fluorescence within a clone was consistently
similar, as would be expected in a clonal population. There were, however, green
cells within the blue sorted clones, which may indicate that there is some loss of
the BLEC-1 plasmid integration site when clones are grown up from a single cell.
Single clones were expanded and used to make RNA for RACE to identify
the target gene and DNA for Southern analysis.



Example 5 Isolation of Jurkat BLEC integrated clones that constitutively express
beta-lactamase

Jurkat cells are a T-cell line derived from a human T-cell leukemia. This cell line
maintains many of the signaling capabilities of primary T-cells and can be activated using
anti-CD3 antibodies or mitogenic lectins such as phytohemagglutinin (PHA). Wild type
Jurkat cells were transfected by electroporation with a beta-lactamase trapping construct
(BLEC-1, BLEC-1A, or BLEC-1B, see FIG. 3) ("BLEC constructs") that contains a
beta-lactamase gene that is not under control of a promoter recognized
by the Jurkat cells and a neomycin resistance gene that can be expressed in Jurkat cells.
BLEC-1 is set forth in FIG. 3. BLEC-1A has a NotI site after the SV40 poly A site.
This allows the cutting of the insert away from the plasmid backbone. BLEC-1B is the
same as BLEC-1A except that the ATG at the beta-lactamase translation start has been
changed to ATC. This eliminates the translation start site and requires the addition of an
upstream ATG to produce beta-lactamase. Stable transformants were selected for their
resistance to 800 µg/ml G418. After 400 separate experiments, a pool of greater than one
million clones with BLEC insertions was produced. This population of cells is a library
of cell clones in which the BLEC construct is inserted throughout the genome ("Jurkat
BLEC library"). Approximately ten percent of the cells in this library express beta-
lactamase in the absence of added stimuli. Beta-lactamase activity in the cells was
determined by contacting the cells with CCF2-AM and loading in the presence of
Pluronic 128 (from Sigma) at about 100 µg/ml. Individual clones or populations of
cells that express beta-lactamase can be obtained by FACS sorting.
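The effect of the ATG to ATC change described for BLEC-1B can be illustrated with a short string-level sketch in Python. This is only a toy model under stated assumptions: the helper names are hypothetical, the check ignores reading frame and sequence context, and the Kozak sequence listed in Table 2 (GGTACCACCATGG) is used purely as a convenient example sequence, since the exact sequence around the BLEC-1B start site is not given in the text.

```python
def has_start_codon(seq):
    """True if the sequence contains an ATG anywhere (reading frame ignored)."""
    return "ATG" in seq.upper()


def mutate_first_atg_to_atc(seq):
    """Replace the first ATG with ATC, mimicking the BLEC-1B change."""
    s = seq.upper()
    i = s.find("ATG")
    if i == -1:
        return s  # nothing to change
    return s[:i] + "ATC" + s[i + 3:]


# Example sequence: the Kozak sequence from Table 2 (illustration only).
kozak = "GGTACCACCATGG"
mutated = mutate_first_atg_to_atc(kozak)

if __name__ == "__main__":
    print(kozak, has_start_codon(kozak))
    print(mutated, has_start_codon(mutated))
    # An upstream ATG (here simply prepended) restores a start codon.
    print(has_start_codon("ATG" + mutated))
```

In this toy model the mutated sequence no longer contains a start codon, which mirrors why a BLEC-1B insertion needs an upstream ATG, supplied by the trapped endogenous gene, before beta-lactamase can be produced.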

Genomic Southern analysis of these clones using a DNA probe encoding beta-
lactamase showed the vector inserted into the host genome between one and three times
per cell, with most clones having one or two vector insertion sites (for Genomic Southern
analyses, see Sambrook, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Laboratory Press (1989)). Northern analysis of these clones using a DNA probe that
encodes beta-lactamase showed that the level of expression and message size varied from
clone to clone (for Northern analysis, see Sambrook, supra, (1989)). This indicated that
fusion transcripts were being made with different genes functionally tagged with beta-
lactamase, which allows for the reporter gene to be expressed under the same conditions
as the endogenous gene. Using appropriate primers, RACE (Gibco BRL) was used to
isolate the genes linked to the expressed beta-lactamase gene in a subset of these
constitutively expressing clones. These genes were cloned and sequenced using known
methods (see, Sambrook, supra, (1989)). These sequences were compared with known
sequences using established BLAST search techniques. Known sequences that were
identified included: beta-catenin, moesin, and β-adaptin. Additionally, several novel
sequences were identified which represent putative genes.

Example 6 Isolation of Jurkat BLEC integrated clones that show induced 
expression of beta-lactamase upon activation 

Jurkat BLEC integrated clones that exhibit beta-lactamase expression upon 
activation of the Jurkat cells by PHA (PHA induced clones) were isolated by FACS 
sorting a Jurkat BLEC library. These clones represent cells in which the trapping 
construct had integrated into a gene up regulated by PHA (T-cell) activation. Thus, these 
cells report the transcriptional activation of a gene upon cellular activation. Individual 
clones were identified and isolated by FACS using CCF2-AM to detect beta-lactamase 
activity. This clone isolation method, the induced sorting paradigm, used three sequential 
and independent stimulation and sorting protocols. A FACS read out for Jurkat cells that 
don't contain a BLEC construct contacted with CCF2-AM was used as a control. These 
control cells were all green. 

The first sorting procedure isolated a pool of blue (β-lactamase expressing, as
indicated by contacting the cells with CCF2-AM) clones which had been pre-stimulated
for 18 hours with 10 µg/ml PHA from an unsorted Jurkat BLEC library. This pool
represented 2.83% of the original unsorted cell population. This selected pool contained
clones that constitutively express beta-lactamase and clones in which the beta-lactamase
expression was induced by PHA stimulation ("stimulatable clones"). After sorting, this
pool of clones was cultured in the absence of PHA to allow the cells, in the case of
stimulatable clones, to expand and return to a resting state (i.e. lacking PHA induced gene
expression).

The second sorting procedure isolated a pool of green (non-β-lactamase
expressing, as indicated by contacting the cells with CCF2-AM) cell clones from the first
sorted pool that had been grown, post-sorting, without PHA stimulation for 7 days. The
second sorting procedure separates clones that constitutively express beta-lactamase from
cells that express beta-lactamase upon stimulation. This second pool represented 11.59%
of the population of cells prior to the second sort. This pool of cells was cultured in the
absence of PHA to amplify the cell number prior to a third sort.

The third sorting procedure used the same procedure as the first sorting procedure
and was used to isolate individual cells that express beta-lactamase in response to being
contacted with 10 µg/ml PHA for 18 hours. Single blue clones were sorted individually
into single wells of 96 well microtiter plates. This three round FACS sorting procedure
enriched PHA inducible clones about 10,030 fold.

These isolated clones were expanded and tested for PHA inducibility by
microscopic inspection with and without PHA stimulation in the presence of CCF2-AM.
A total of fifty-five PHA inducible clones were identified using this procedure. The PHA
inducibility for these clones ranged from a 1.5 to 40 fold change in the 460/530 ratio as
compared to unstimulated control cells. Genomic Southern analysis using a DNA probe
encoding beta-lactamase established that these clones represented 34 independent stable
vector integration events. A list of clones obtained by the methods of the present
invention and their characteristics is provided below in Table 6 and Table 7.
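The fold change in the 460/530 ratio mentioned above can be computed from blue (460 nm) and green (530 nm) emission intensities of stimulated and unstimulated cells. The Python sketch below is a minimal illustration: the function names and the intensity values are hypothetical, and using the lower end of the reported range (1.5 fold) as a cutoff for calling a clone inducible is an assumption made here, not a criterion stated in the text.

```python
def emission_ratio(blue_460, green_530):
    """Blue/green (460 nm / 530 nm) emission ratio for one measurement."""
    if green_530 <= 0:
        raise ValueError("green emission must be positive")
    return blue_460 / green_530


def fold_induction(stimulated, unstimulated):
    """Fold change in the 460/530 ratio, stimulated over unstimulated.

    Each argument is a (blue_460, green_530) tuple of intensities.
    """
    return emission_ratio(*stimulated) / emission_ratio(*unstimulated)


def is_inducible(fold, threshold=1.5):
    """Assumed cutoff: the lowest fold change reported for the isolated clones."""
    return fold >= threshold


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units, for illustration only.
    unstim = (100.0, 1000.0)   # mostly green: ratio 0.1
    stim = (800.0, 400.0)      # mostly blue: ratio 2.0
    fold = fold_induction(stim, unstim)
    print(fold, is_inducible(fold))
```

Because the readout is a ratio of two emission wavelengths from the same cell, it is largely independent of how much substrate each cell has loaded, which is why the ratio rather than the raw blue intensity is compared between conditions.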

In addition to PHA inducible clones, Phorbol 12-myristate 13-acetate (PMA)
(Calbiochem), Thapsigargin (Thaps) (Calbiochem), and PMA + Thaps inducible clones
were isolated using the general procedure set forth above using the indicated inducer
rather than PHA. PMA is a specific activator of PKC (protein kinase C) and Thaps is a
specific activator of intracellular calcium ion release. These clones were isolated
using three rounds of FACS using the general procedures described for the PHA
inducible clones in Example 5. In such instances, other stimulants were substituted for
PHA. PMA was provided at 8 nM and Thaps was provided at 1 µM. When these two
stimulants were combined, their concentration was not changed. As shown in Table 5,
clones were selected based on their activation by PMA, Thaps, or PMA with Thaps after
three or eighteen hours of stimulation ("stimulation time"). These results demonstrate
that the FACS sorting criteria can be varied depending upon the type of modulated clones
desired. By using varied selection conditions, it is possible to isolate functionally distinct
clones downstream of the desired signaling target.



Example 7 Isolation of Jurkat BLEC integrated clones that show repressed 
expression of beta-lactamase upon activation 

Jurkat BLEC clones that exhibit decreased beta-lactamase expression upon
activation of the Jurkat cells by PHA were isolated by FACS sorting. These clones
represent cells in which the BLEC trapping construct had integrated into a gene
down-regulated by PHA (T-cell) activation. Thus, these cells report the transcriptional
repression of a gene upon cellular activation. Individual clones were identified and
isolated by FACS using CCF2-AM to detect beta-lactamase activity using the following
repressed sorting paradigm.

A first sort was used to isolate a population of cells that constitutively express
beta-lactamase by identifying and isolating a population of blue cells from an
unstimulated population of BLEC transfected Jurkat cells contacted with CCF2-AM.
The sorted population of cells represented 2.89% of the unsorted population. These cells
were cultured, divided into two pools, and stimulated with one of two different stimuli,
either 1 µg/ml PHA for 18 hours, or 8 nM PMA and 1 µM thapsigargin for 18 hours.
These stimulated cells were contacted with CCF2 (loading in the presence of PEG 400
(4% weight/volume) and Pluronic 128 (100 µg/ml)) and the green cells in the population
were sorted using FACS. The sorted population represented 8.41% of the cell
population prior to the second sort. The third round of FACS selected single blue
cells from the unstimulated population. The population of cells obtained represented
18.2% of the cell population prior to the third sort.

This sorting procedure represents a 2,260-fold enrichment for PHA repressible
clones. These clones have the beta-lactamase gene integrated into a gene that is
down-regulated by PHA stimulation of the cells. Six of 80 individual clones tested were
repressed by PHA or PMA + thapsigargin. All of these clones were confirmed to be
independent integration events by genomic Southern analysis using a DNA probe
encoding beta-lactamase. The results of these studies are presented in Table 5.
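
The 2,260-fold figure can be reproduced from the three sort percentages given in this example. The sketch below assumes that the enrichment is the reciprocal of the product of the fractions retained at each sort; the text does not state the formula, but this assumption matches the quoted value. The function name is illustrative.

```python
from math import prod


def fold_enrichment(sorted_fractions):
    """Fold enrichment = 1 / product of the fraction of cells kept at each sort."""
    fractions = list(sorted_fractions)
    if not fractions or any(not 0 < f <= 1 for f in fractions):
        raise ValueError("each fraction must be in (0, 1]")
    return 1.0 / prod(fractions)


# Percentages reported for the first, second and third sorts in Example 7.
sort_percentages = [2.89, 8.41, 18.2]
enrichment = fold_enrichment(p / 100 for p in sort_percentages)
print(f"{enrichment:.0f}-fold")  # roughly 2,260-fold, as quoted in the text
```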



TABLE 5

Identification of trapping cell lines with reporter gene
expression that is regulated by T-cell activation

Stimuli (first sort) | Chemical and Time of Exposure | Stimulation Time | Sorting Paradigm | Clones Isolated | Clones with One Vector Insertion | Clones with Two Vector Insertions
PHA (10 µg/ml) | PHA, 18 hours | 18 hours | Induced | 34 | 24 | 10
PMA (8 nM) + Thaps (1 µM) | PMA + Thaps, 3 hours | 3 hours | Induced | 2 | 2 | 0
PMA (8 nM) | PMA, 3 hours | 3 hours | Induced | 3 | 2 | 1
Thaps (1 µM) | Thaps, 3 hours | 3 hours | Induced | [illegible] | 2 | 0
PHA (10 µg/ml) or PMA (8 nM) + Thaps (1 µM) | No stimulation | 18 hours | Repressed | 6 | 5 | 1




Example 8 Specificity of T-cell modulated clones 

Isolated clones from the PHA-induced (Example 6) and PHA-repressed (Example 7)
procedures described above were characterized to determine the specificity of their
modulation and the time required for induction or repression. Clones were stimulated with
multiple activators or inhibitors over a one to twenty-four hour time interval. As shown
in Table 6, five clones produced by the induced and repressed sorting paradigms using a
plurality of activators were tested for their responsiveness to a variety of T-cell activators,
suppressors, and combinations thereof.



TABLE 6

Sorting protocols and specificity of activated BLEC Jurkat clones 



1 Clone 


Sorting Procedures 


Relative Beta-Lactamase Activity of the Close c** the Indicated Sbmaius 
After 24 bo on (V* of maximum activated famuli) 


1 


Paradigm 


First Son 
Stimulus 
and 

(cell cotor 
sorted for) 


Second ; Third Sort 
Son ; S Ursulas 
Stimulus f And 
tad ] (cell color 
(cHI co*or i sorted for) 
sorted 
for/ 

i 


None 


PMA 
(InM) 


Thans 

(l*M) 

/ 


PMA 

fUM)* 

Thaps 

<UM> 


PMA 

(InM)* 

Toaps 

CiA 

(too 

■M) 


PHA (10 i PHA 
t^ml) j (10^/m) 

| CsA 
1 (100 

i nM) 

■ 


J8J-PI 1 ; 

',. 1 


induced 

1 1 


PHA* 
(blue. 


N-'S 

(treem 


PHA 
(blue) 


0 


<l 


100 1 30 

1 


1 ! 




Induced 


PHA (oluel ! N/b ] PHA 
| (preen » < (bluet 


0 


60 


1-2 


100 


70 | 80 75 


' C2 j N/.S 


N/$ 


N/S 


N/S 


o 


<l 


0 i 100 

I 


<i 


30 , i 


J38v- 
PT14 


InuuccJ 


PMA b 
■t- 

Thaps' 
(blue) 


N/S : PMA * 
tpeen) \ Thaps 
| (blue) 


0 


00 


5 


35 


100 


85 


<*) 


J83 07- 
PPTR2 


Rtorettcd 


N/S 
(blue) 


PMA ! N/S 
j (blue) 

Thaps 

(green) | 


0 


100 


85 


•50 


85 




75 


J83- \ tnouccd 

ms 1 


PHA (blucj 


N/S i PHA 
(preen) ! (blue) 


0 


80 


100 


2: j 70 


60 i 60 
1 



"N/S" means "no stimulation" 



Concentration of PHA used was 10 µg/ml.
Concentration of PMA used was 8 nM.
Concentration of Thaps used was 1 µM.
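
Table 6 reports activity as a percentage of the maximum activating stimulus. A minimal sketch of that normalization is given below; the raw values are hypothetical and the function name is illustrative, since the source does not describe how the percentages were computed.

```python
def percent_of_max(activities):
    """Scale each stimulus response so the largest response equals 100."""
    peak = max(activities.values())
    if peak <= 0:
        raise ValueError("at least one response must be positive")
    return {stimulus: 100.0 * value / peak for stimulus, value in activities.items()}


# Hypothetical raw responses (arbitrary units) for a single clone.
raw = {"None": 0.0, "PMA": 1.2, "Thaps": 0.1, "PMA + Thaps": 2.0, "PHA": 1.6}
normalized = percent_of_max(raw)
for stimulus, value in normalized.items():
    print(f"{stimulus}: {value:.0f}%")
```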



In this study, PMA, which is a PKC activator; thapsigargin, which increases
intracellular calcium; PHA, which activates the T-cell receptor pathway; and cyclosporin
A, which is a clinically approved immunosuppressant that inhibits the Ca2+-dependent
phosphatase calcineurin, were investigated for their ability to modulate beta-lactamase
expression in PHA induced and repressed BLEC clones.

The selected clones show varied dependence for their activation and inhibition by
these activators and inhibitors, which gives an indication of the signaling events required
for their transcriptional activation. Five of the listed clones were generated using the
approaches described above in Example 6. The clone C2 was generated using a more
classical approach: transfecting a plasmid construct in which
a 3X NFAT response element has been operably linked to beta-lactamase expression.
This 3X NFAT element represents a DNA sequence that is present in the promoter region
of IL-2 and other T-cell activated genes. In addition, the C2 cell line has been stably
transfected with the M1 muscarinic receptor. This allows the activation of beta-lactamase
expression in this clone using an M1 muscarinic agonist such as carbachol. This cell line
therefore represents a good control for the cellular activators and inhibitors tested, as the
signaling events required for its activation are established.

The results of these studies indicate that the cell lines generated vary in their
specificity towards activation or repression by activators. Thus, depending on the type of
system that these cells are to be used to investigate, a panel of clones with varying
specificity towards a specific pathway is made available by the present methods.
Table 7 and Table 8 provide data similar to that provided in Table 5 for all of the
clones obtained by the methods of Examples 5 to 7.



TABLE 7 



Characterization of induced BLEC Jurkat clones 







Change in 460/530 ratio in the indicated clone by the following activator

Columns: CLONE Number | TIME (hours) for first detectable change in color | PHA (1 µg/ml) | Thaps | PMA (8 nM) | PMA (8 nM) + Thaps (1 µM) | Anti-CD3 (Pharmingen)


J325B5 


6 


7 


Nt 


2-3 


Nt 


4-5 


J325BM 


6 l 9 


1-2 


2-3 Nl 


5-6 


J325E3 


6 i 7 


Nt 


2-3 


Nt 


4-5 


J325G4 


6 3-4 


Nl 


3-4 


Ni 


4-5 


J325E6 


6 ! II 


Nt 


3-4 


Nt 


6 


J326C9 


6 ; 4-5 


1-2 


2-3 


Nt 1 


3-4 


J325EI 


<2 


8 


Nt 


* 


Nt 


5-6 


J326D4 


<2 


10 


0 


10 


Nt 


5-6 


J326D7 


<2 


10 


Nl 


10 


Nt 


^ 


J326F7 


<2 


10 


Nt 


10 


Nt 


5-6 


J326H4 


<2 


10 


Nt 


10 


Nt 


5-6 


J83PI1 


Nl 


3-4 


3-4 


3-4 


4-f 


2*3 


J83P12 


5-6 


8 


1-2 


7-8 


7-8 


3*4 


J83PI8 


5-6 


4-5 | L2 


4-5 


4-5 


2-3 


J83P13 


5-6 


5-6 


6-7 


3-4 


5-6 


2-3 


JB3PI4 | 4-0 


3-4 


3-4 


0 


2-3 


2 


; J83P16 




6-7 


7-JJ 


0 


4-5 


4 


J83PJO 




0 


5-6 


0 


4-5 


3-4 


IBlPtC 


Nt 


Nt 


Nt 


Nt 


Nt 


Nt 


JBJrf / 


6-1 8 


2 


2 


2 


2 


1.5-2 


J83PI J 5 


Nt 


3-4 


2 


3-4 


3-4 


3-4 


1 ft 1 P 1 w. 


N't 


3-4 


1-2 


3-4 


3-4 


2-3 


IB**. PI 1 It 

Jojri J o 


Nt 


5-6 


7-8 


5 


Nt 


H\ 


J83P1 1 2 


Nt 


Nt 


Ni 


Nl 


Nt 


Nl 


1BTPI I A 
JBJrll** 


Nt 


2 


i 2 


2 


Nt 


Nt 


lfllPl 1 7 
JBJrl 1 / 


Nl 


Nl 


\ Nt 


Nt 


Nt 


Nt 


■ 3K3PI19 


Nt 


5-6 


1-2 


3 


1-2 


1-2 


J83P1] 1 


Ni 


Nl 


Hi 


Nt 


N: 


Ni 




Nl 


20 


4o 


0 


Ni 


N1 


J97P1 1 


Nt 


3-4 


3-4 


J-4 


3-4 


3-4 




Nt 


20 


Nt 


Nt 


20 


Nt 


i JV/rlJ 


Nt 


1-2 


1-2 


1-2 


1-2 


Nl 


t J97PI4 





i-2 


1-2 


1-2 


1-2 


Nt 


J97PI5 


N; 


: 5 


1 9 


1.3 


2-3 


Nr 


i jy/rIC 


Nt 


3-4 


4-6 


1-2 


4-6 


Nl 


Jv /rl 1 J 


WW 

N, _. 


20 


5-6 


1-2 


4-5 


Nl 


J v frl I B 




i-2 


3-4 


1-2 


4-5 


Nt 


I07PI7 


Nil 

. N ' ., -J 


3— 


4-5 


1-2 


5-0 


Nt 


jv frl i / 


Nt 


4-5 


7-8 


1-2 


8-10 


Nl 


J97PI8 


_ , 


^TTo 


3-4 


1-2 


3-4 


Nl 


I07PIO 
J V / rl V 




2-3 


4-5 


1-2 


5-6 


Nt 


Jv/rllU 


Kli 

Nl 


3-4 


3-a 1 


1-2 


4-5 


Nt 


J97PE23 


Kir 


4-5 


4.5 


1-2 


4-5 


1-2 


J97PI1 I 


Ni 


3-4 


5-6 


2 


4-5 


Nt 1 


JV frl Ji 


pit 


1-2 


3-4 


1-2 


3-4 


Nt j 


107DI 1 1 


Nt 


3-* 


5-6 


2-3 


5-6 


Nt 


J97P122 


Nt 


5-6 


5*7 


2-3 


3-4 


3-4 




Kir 


4-5 


3-4 


2 


4-5 


Nt ! 


JV frl \ I 0 


Nt 


2-3 


3-4 


20 


4 


N; [ 


jv /rl ] v 


Nt 


20 


20 


1-2 


2-4 


Nt 


107PI7.fl 
JV frliV 


KJt 

nt 


1-2 


20 


1-2 


1-2 


Ni 


IG7P17 1 
jv frl* 1 


nt 


20 


2-3 


1-2 


2-3 


2-3 


J v Ir\£4 


KJ< 

nt 


3-4 


3-4 


2-3 


7-10 1 


3-4 


J389PTI 




5-6 


3-4 


8-9 


8-9 | 


3-4 


J389PT4 


1 Hout 


15 


10 


12 


1 


15 


J389PM2 


i hour 


4-5 


3-4 


3-4 


4-5 


4-5 


J389PM3 


lhour 


3-4 


2-3 


20 


3-4 


3-4 


J389PM5 


lhour 


4-5 


3-4 


3-4 


4-5 


4-5 


J 3 89PM 7 


3hours 


J-2 


20 


1-2 


-L2 I 


1-2 


J3B9PM8 


2-3houn 


2-3 


3-4 


2-3 


20 1 


3-4 


J389TI1 


3-5hx>ur* 


1-2 


2-3 


1-2 


2-3 


2-3 


J389TI4 


2 hour 


0 


3-4 


1-2 


1 


0 



"Nt" means "not tested".

TABLE 8

Characterization of repressed BLEC Jurkat clones

Relative repression of beta-lactamase in the indicated clone by the following activator

CLONE # | PHA | PHA + CsA (100 nM) | PMA (8 nM) + Thaps | PMA (8 nM) + Thaps + CsA (100 nM)
J83/97pptr1 | 90 | 90 | 75 | 75
J83/97pptr2 | 10 | -60 | 10 | -80
J83/97pptr3 | 10 | -50 | 10 |
J83/97pptr4 | 60 | 60 | 40 | 70
J83/97pptr5 | 50 | 60 | 50 | 50
J83/97pptr6 | 70 | 70 | 70 | 70




To confirm that changes in reporter gene activity reflected changes in mRNA
expression in these clones, Northern analysis was performed on induced, constitutive, and
repressed clones using a radiolabeled DNA probe directed towards the beta-lactamase
gene. All clones tested that showed beta-lactamase enzyme inducibility also showed
beta-lactamase mRNA inducibility. All clones that showed constitutive expression of
beta-lactamase showed constitutive expression of beta-lactamase mRNA. All clones that
showed repressed beta-lactamase expression showed repressed beta-lactamase mRNA.
The message size of the control beta-lactamase mRNA was about 800 base pairs. The
RNA from some of the other beta-lactamase clones was shifted higher in the gel,
indicating that a fusion RNA had been made between the endogenous transcript and
beta-lactamase. Two known genes, CDK-6 (isolated from clone J83-PTI1) and Erg-3
(isolated from clone J89-PTI4), and two unknown genes (isolated from clones J83PI15
and J83PI2, respectively) were identified. For clone J389-PTI4, a Northern
blot was performed with an Erg-3 probe, made using appropriate PCR primers
determined from a published sequence, which hybridizes with both the fusion RNA and
the wild type RNA (for the sequence of Erg-3 see Stamminger et al., Int. Immunol. 5:63-70
(1993); for PCR methodologies, see U.S. Patent Nos. 4,800,159, 4,683,195, and
4,683,202). The inducibility in wild type Jurkat cells mimicked the beta-lactamase
activity in this clone.



Example 9 Screening of a library of known pharmacologically active modulators
using a T-cell activated BLEC clone

T-cell clone J32-6D4 was used to identify potential inhibitors of the T-cell
receptor pathway. This clone was selected for further study because it is difficult to
identify chemicals that inhibit a specific T-cell receptor pathway. Thus, this clone was
used to identify chemicals that inhibit this T-cell receptor pathway, which is also stimulated
by the PKC activator PMA.

A first screen was performed using a generic set of 480 chemicals with known
properties. The chemicals in this set were known to have pharmacological activity.
Approximately one percent (7/480) of these chemicals showed greater than 50%
inhibition of the PHA activation of beta-lactamase expression in clone J32-6D4 when
tested in duplicate at 10 µM of chemical. Cells were activated with [illegible] µg/ml of PHA for 18
hours in the presence of test chemicals to test for inhibitory activity. The seven chemicals
that inhibited clone J32-6D4 are shown in Table 9. Two of these chemicals
specifically inhibited clone J32-6D4 and not the control C2 cell line. This assay for the
specificity of inhibition included screening these 480 chemicals for inhibitory activity
using clone C2, in which the M1 muscarinic receptor was linked to an NFAT
beta-lactamase reporter gene readout (see Example 7). In these experiments, the inhibition
measured was the inhibition of carbachol induced expression of beta-lactamase. These
results, the specific inhibition of J32-6D4 cells but not C2 cells, show that the chemicals
are not toxic, do not inhibit general transcription, and do not inhibit the reporter gene
product.
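
The two-step logic of this screen (a primary cutoff on the test clone, then a counter-screen on the control clone) can be sketched as below. This is an illustration only: the record type, field names and compound entries are hypothetical, and only the 50% cutoff and the 7/480 count come from the text.

```python
from dataclasses import dataclass


@dataclass
class ScreenResult:
    name: str
    inhibition_j32_6d4: float  # % inhibition of PHA activation in the test clone
    inhibits_c2: bool          # True if the carbachol response of clone C2 is inhibited


def primary_hits(results, threshold=50.0):
    """Chemicals whose inhibition of the test clone exceeds the threshold."""
    return [r for r in results if r.inhibition_j32_6d4 > threshold]


def specific_hits(results, threshold=50.0):
    """Primary hits that do not also inhibit the control clone."""
    return [r for r in primary_hits(results, threshold) if not r.inhibits_c2]


# Hypothetical screening records, for illustration only.
results = [
    ScreenResult("compound-1", 86.0, True),
    ScreenResult("compound-2", 62.0, False),
    ScreenResult("compound-3", 12.0, False),
]
print([r.name for r in specific_hits(results)])
print(f"primary hit rate for 7 of 480 chemicals: {7 / 480:.1%}")
```

Requiring a negative result in the control clone is what separates pathway-specific inhibitors from chemicals that are toxic or that block the reporter itself.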



TABLE 9

Active chemicals identified as exhibiting inhibitory activity of PHA activation of clone J32-6D4

Chemical | % Inhibition of PHA activation of Clone J32-6D4 | Inhibition of Clone C2 | Therapeutic Category of the Chemical
Digoxin | 86 | + | Cardiotonic
Digitoxin | 77 | + | Cardiotonic
Gentian Violet | 73 | + | Topical anti-infective
Oxyphenbutazone | [illegible] | [illegible] | Anti-inflammatory
Mechlorethamine | 51 | [illegible] | Anti-neoplastic
Dipyrithione | 70 | + | Anti-bacterial
Ouabain | 50 | + | Cardiotonic
Thioguanine | 50 | + | Anti-neoplastic




Example 10 Screening a library of structurally characterized chemicals having
unknown pharmacological properties for modulating activity of the T-cell receptor
pathway using a T-cell activated BLEC clone

Having demonstrated in Example 9 that clone J32-6D4 performs robustly in a
chemical screen, this clone was used to screen an additional 7,500 chemicals from a
proprietary chemical library at a concentration of 10 µM per chemical. This collection of
chemicals, unlike the collection of chemicals used in Example 9, contains chemicals
without known pharmacological activity. Seventy-seven chemicals showed at least 50%
inhibition of PHA activation of beta-lactamase expression following the general
procedures set forth in Example 7. These 77 chemicals were re-tested for this activity
using the same procedure and 31 chemicals were confirmed to have activity. The IC50
values of the inhibition of PHA activation of beta-lactamase expression were determined
for these 31 chemicals using concentrations of chemical between about 20 µM and 2 nM.
IC50 values reflect the concentration of a chemical needed to inhibit the PHA activation
of the clone by 50% and were determined using known methods. These 31 chemicals
were also tested for their cross inhibition of carbachol-induced activation of
beta-lactamase expression of clone C2 as described in Example 8.
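
The source states only that IC50 values were determined "using known methods". One simple approach, shown here purely as an assumed illustration, is linear interpolation on log concentration between the two doses that bracket 50% inhibition. The dilution series and inhibition values are hypothetical.

```python
import math


def ic50_log_interpolation(concentrations_nm, percent_inhibition):
    """Estimate IC50 by linear interpolation on log10(concentration).

    Points must be sorted by increasing concentration. Returns None if the
    responses never cross 50% between two adjacent doses.
    """
    points = list(zip(concentrations_nm, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            fraction = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical ten-fold dilution series spanning 2 nM to 20 µM (values in nM).
concentrations = [2, 20, 200, 2000, 20000]
inhibition = [5, 20, 45, 85, 98]
ic50 = ic50_log_interpolation(concentrations, inhibition)
print(f"estimated IC50: {ic50:.0f} nM")
```

A four-parameter logistic fit is the more common choice when enough dose points are available; interpolation is used here only because it needs no external libraries.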

Two chemicals, designated chemical A and chemical B, exhibited IC50 values
of about 200 nM and specifically inhibited the PHA activation of beta-lactamase
expression of clone J32-6D4 but not the carbachol activation of clone C2 at the
concentration tested. All of the other chemicals among the 31 either inhibited both clone J32-6D4
and clone C2, or had IC50 values above [illegible] µM.

[Chemical structures of Chemical A and Chemical B]

Chemicals A and B were further tested for their anti-proliferative effect on Jurkat
cells and mouse L-cells (a mouse fibroblast cell line). Chemical B showed no
anti-proliferative effect on either the Jurkat cells or the L-cells at concentrations up to 10 µM.
Chemical A exhibited an anti-proliferative effect on the Jurkat cells and L-cells at 100 nM.
Proliferation assays were performed by seeding about 20,000 cells unactivated by PHA
into a 24 well plate. These cells were contacted with chemicals and were then incubated
at 37°C for five days. The cells were contacted with 10 µg/ml of MTT (Sigma Chemical
Co., MO) for three hours. The cells were then collected, resuspended in isopropanol, and
the absorbance was read in a plate reader at a wavelength of 570 nm with background
subtraction of a reading at a wavelength of 690 nm (see Carmichael et al., Cancer Res.
47:936 (1987)).
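
The background subtraction described for the MTT readout can be sketched as follows. Expressing the result as a percentage of untreated control wells is an assumption added for illustration, and all absorbance values are hypothetical.

```python
def corrected_absorbance(a570: float, a690: float) -> float:
    """Subtract the 690 nm background reading from the 570 nm reading."""
    return a570 - a690


def percent_of_control(treated_wells, control_wells):
    """Mean corrected signal of treated wells as a percentage of control wells.

    Each argument is a list of (A570, A690) pairs, one pair per well.
    """
    def mean_signal(wells):
        return sum(corrected_absorbance(a570, a690) for a570, a690 in wells) / len(wells)

    control_mean = mean_signal(control_wells)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * mean_signal(treated_wells) / control_mean


# Hypothetical plate readings, for illustration only.
control_wells = [(1.05, 0.05), (0.95, 0.05)]
treated_wells = [(0.50, 0.05), (0.55, 0.05)]
viability = percent_of_control(treated_wells, control_wells)
print(f"signal relative to control: {viability:.0f}%")
```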



Example 11 Effects of identified chemicals on primary human T-cell proliferation. 

An assay was developed to test the chemicals identified in Example 9 for their
ability to inhibit the activation and proliferation of normal peripheral white blood cells to
confirm their presumptive activity (see generally, Harlow and Lane, Antibodies: A
Laboratory Manual, Cold Spring Harbor Press (1988)). Peripheral blood from normal
humans was drawn into heparinized Vacutainer® tubes and incubated with various
concentrations of the superantigen staphylococcal enterotoxin B (SEB, at 0.001 to 10
ng/ml) for 1 hour at 37°C. Brefeldin A was added and the cells were incubated an
additional 5 hours. EDTA was added to detach the cells, and a 100 µl aliquot was
removed, the red blood cells lysed with ammonium chloride, the remaining cells counted
and their viability determined by viability staining using known methods. The red
blood cells remaining in the original sample were lysed with ammonium chloride and the
remaining cells (leukocytes) were permeabilized with FACS permeabilizing solution using
established methods. These leukocytes were harvested by centrifugation, washed and
stained with the combination of antibodies CD69, IFN-γ and CD3, which were detectably
labeled. Control cells consisted of cells incubated in the absence of SEB and staining
control cells consisted of cells stained with CD69/MsIgG1 and CD3 antibodies, which
were detectably labeled. Similar cultures will be incubated for 71 hours, pulsed with
tritiated thymidine for 1 hour and harvested, and the incorporated radioactivity counted
by scintillation to determine a stimulation index using established methods.
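
The stimulation index is not defined in the source beyond "established methods". A common definition, assumed here, is the mean counts per minute (cpm) of stimulated cultures divided by the mean cpm of unstimulated controls. The counts below are hypothetical.

```python
from statistics import mean


def stimulation_index(stimulated_cpm, unstimulated_cpm):
    """Mean cpm of stimulated cultures divided by mean cpm of unstimulated controls."""
    baseline = mean(unstimulated_cpm)
    if baseline <= 0:
        raise ValueError("baseline counts must be positive")
    return mean(stimulated_cpm) / baseline


# Hypothetical triplicate scintillation counts, for illustration only.
seb_stimulated = [12000, 13000, 11000]
unstimulated = [400, 380, 420]
index = stimulation_index(seb_stimulated, unstimulated)
print(f"stimulation index: {index:.1f}")
```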

Using preferred concentrations of SEB, various concentrations of cyclosporin A
(CsA) were added to determine optimal conditions of CsA for blocking SEB
stimulation of peripheral blood T-cells for use as a control for non-proliferative T-cells.
Controls consisted of cells incubated with culture media in place of CsA. Control
cultures incubated for 1 hour were blocked with Brefeldin A for an additional 5 hours,
harvested, and stained for intracellular IFN-γ, or cultured for an additional 71 hours,
pulsed with tritiated thymidine for one hour, harvested, and counted by liquid
scintillation.

Using preferred concentrations of SEB and CsA, blood from normal donors was
stimulated in the presence and absence of CsA. This established expected normal ranges
for the degree of activation (% activated CD3+ cells for 6 hours), proliferation
(3H-TdR uptake at 72 hours) and CsA blocking at both time points.

Using preferred conditions, human blood was incubated with Chemical A or
Chemical B at 2, 20, and 200 nM. CsA was used as a positive control for T-cell
suppression. One hour cultures were blocked with Brefeldin A for an additional 5 hours,
harvested and counted by liquid scintillation. Cell counts and percent viability were
reported for each culture condition.

The results of these studies should demonstrate that at least one of the chemicals
identified by the methods of the present invention has the predicted pharmacological
activity in human cells.

Example 12: Identification of genes expressed during developmental programs. 

Another use of this method is for the identification of genes expressed during
various cellular processes, such as development and apoptosis. Genes involved
in specific developmental programs, such as the differentiation of pre-adipocytes to
mature adipocytes, can be identified using this method.

In order to practice this method, a clone library from a pre-adipocyte cell line such
as 3T3-L1 is made using the methods generally described in Examples 10 to 12 above.
Of course, pre-adipocyte cells are used rather than Jurkat cells. This cell line can be
reversibly differentiated to mature adipocytes by exposing the cells to dexamethasone and
indomethacin (see Hunt et al., Proc. Natl. Acad. Sci. U.S.A. 83:3786-3789 (1986)).
These mature adipocytes can be reversibly differentiated to pre-adipocytes with Tumor
Necrosis Factor alpha (TNFα) (see Torti et al., J. Cell. Biol. 108:1105-1113 (1989)). Thus,
a cell library capable of signaling the expression of genes involved in cellular
differentiation can be made.

The 3T3-L1 gene trap library is FACS sorted to remove blue cells that constitutively
express beta-lactamase. The remaining green cells are then differentiated into
mature adipocytes using dexamethasone and indomethacin. Blue (beta-lactamase
expressing) cells are isolated using FACS. These clones represent cells in which the
trapping construct integrates into a gene that is expressed in differentiated adipocytes, but
not in undifferentiated adipocytes. This process can be repeated multiple times to ensure
enrichment for cells that express adipocyte specific genes.

Alternatively, cell clones can be isolated which are differentiated for a specific
time interval. For instance, blue and green cells differentiated for 2 days with
dexamethasone and indomethacin are sorted. These populations of cells represent cells
in which the trapping construct integrates into a gene that is expressed early in the
differentiation process. This allows the identification of genes that are expressed during
the developmental program but are not expressed in pre-adipocytes or mature adipocytes.
This method can be used to isolate genes expressed during a variety of developmental
programs, including but not limited to neuronal, cardiac, muscle, and cancer cells.

These cell lines can be used to identify genes involved in the differentiation
process, and can also be used to screen chemicals that modulate the differentiation
process using the methods described in Examples 8 to 10 above. Drugs that can be
identified include those that enhance the growth of cells, such as neuronal cells, or
depress the growth or reverse differentiation of cells, such as cancer cells.



Example 13: Assays for modulators of G-protein coupled receptors 

The general procedures of Examples 8 to 10 can be used in an analogous manner
to identify cell lines suitable for screens for G-protein coupled receptors (GPCRs).
GPCRs are known to signal via one of several intracellular pathways. These pathways
can be activated pharmacologically in cell libraries to yield potential screening cell lines.
For example, Gq coupled GPCRs are known to raise intracellular free calcium via
activation of phospholipase Cβ (PLCβ). By isolating cell lines responsive to an increase
in calcium from the genomic library (e.g. induced by ionomycin or thapsigargin), screening
cell lines are generated.

For example, a calcium-sensitive clone was transfected with a Gq-type GPCR by
electroporation. Cells from clone J389PTI4 were transfected by electroporation with a
plasmid, either pcDNA3 (Invitrogen) or pcDNA3-M1 (pcDNA3 that can operably express
the M1 receptor), to make cell lines J389PTI4/pcDNA3 and J389PTI4/pcDNA3-M1. Cell line
J389PTI4/pcDNA3-M1 expressed the M1 receptor, whereas the cell line
J389PTI4/pcDNA3 did not. Thus, the J389PTI4/pcDNA3 cell is a control cell. Two days
after transfection, cells were stimulated with 20 µM carbachol in a 96-well microtiter plate
for 6 hours at 37 °C. These cells were contacted with CCF-2 dye for another 90 minutes.
The 460/530 ratio changes were measured in a Cytofluor (Series 4000 Model) (PerSeptive
Biosystems) fluorescence plate reader and correspond to reporter gene expression. These
results are summarized in Table 10. The ability of the transiently transfected clone to
detect a ligand for the GPCR demonstrates the potential of generating screening cell lines
using clones made following the procedures of the present invention. The stimulation by
carbachol detected in the transient transfection assay represents a response in about 20% of
the cells. To develop a stable screening cell line for the M1 receptor, this population can
be sorted for individual clones responsive to carbachol and those clones can be expanded
and screened to identify the most responsive clones.

Similar methods can be used to generate cell lines for Gs or Gi-coupled receptors.
In these cases, clones responsive to increases or decreases in cAMP can be isolated. A
variety of cell lines can be used for these procedures, such as CHO, HEK293,
neuroblastoma, P19, F11, and NT-2 cells.



TABLE 10 

Cell lines that report modulation of the Ml receptor pathway 



| Relative expression of beta-lactamase in cells 
Exposed to the indicated stimuli 


Cell Line 


Unstimulated 


30*M Carbachol 


lOnMPHA 


J389PTI4/pcDNA3 


i i i 


12 


J389PTI4/pcDNA3-Ml 11 I 4 


13 
SEQUENCE ID. LISTING

SEQ. ID NO. 1: range 1 to 795

10 20 30 40 50
ATG AGT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT CAG TTG
Met Ser His Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu

60 70 80 90 100
GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT
Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu

110 120 130 140 150
GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT
Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val

160 170 180 190 200
CTG CTA TGT GGC GCG GTA TTA TCC CGT GTT GAC GCC GGC CAA GAG CAA CTC
Leu Leu Cys Gly Ala Val Leu Ser Arg Val Asp Ala Gly Gln Glu Gln Leu

210 220 230 240 250
GGT CGC CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC
Gly Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val

260 270 280 290 300
ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT
Thr Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala

310 320 330 340 350
GCC ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC
Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile

360 370 380 390 400
GGA GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA
Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val

410 420 430 440 450
ACT CGC CTT GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC
Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp

460 470 480 490 500 510
GAG CGT GAC ACC ACG ATG CCT GCA GCA ATG GCA ACA ACG TTG CGC AAA CTA
Glu Arg Asp Thr Thr Met Pro Ala Ala Met Ala Thr Thr Leu Arg Lys Leu

520 530 540 550 560
TTA ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG
Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp

570 580 590 600 610
ATG GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG GCT
Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro Ala

620 630 640 650 660
GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT
Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly

670 680 690 700 710
ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC GTA GTT ATC
Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile Val Val Ile

720 730 740 750 760
TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT
Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala

770 780 790
GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG
Glu Ile Gly Ala Ser Leu Ile Lys His Trp


SEC. ID NO. 2; range 1 to B5 8 



10 20 30 40 50
ATG AGA ATT CAA CAT TTC CGT GTC GCC CTT ATT CCC TTT TTT GCG GCA TTT
Met Arg Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala Phe






60 70 80 90 100
TGC CTT CCT GTT TTT GGT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT
Cys Leu Pro Val Phe Gly His Pro Glu Thr Leu Val Lys Val Lys Asp Ala






110 120 130 140 150
GAA GAT CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC
Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser




160 170 180 190 200
GGT AAG ATC CTT GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC
Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser




210 220 230 240 250
ACT TTT AAA GTT CTG CTA TGT GGC GCG GTA TTA TCC CGT GTT GAC GCC GGG
Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser Arg Val Asp Ala Gly




260 270 280 290 300
CAA GAG CAA CTC GGT CGC CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG
Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu


310 320 330 340 350
TAC TCA CCA GTC ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA
Tyr Ser Pro Val Thr Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu


360 370 380 390 400
TTA TGC AGT GCT GCC ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT
Leu Cys Ser Ala Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu


410 420 430 440 450
CTG ACA ACG ATC GGA GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG
Leu Thr Thr Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met


460 470 480 490 500 510
GGG GAT CAT GTA ACT CGC CTT GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC
Gly Asp His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala






520 530 540 550 560
ATA CCA AAC GAC GAG CGT GAC ACC ACG ATG CCT GCA GCA ATG GCA ACA ACG
Ile Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Ala Ala Met Ala Thr Thr






570 580 590 600 610
TTG CGC AAA CTA TTA ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA
Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln






620 630 640 650 660
TTA ATA GAC TGG ATG GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG
Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser




670 680 690 700 710
GCC CTT CCG GCT GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT
Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg




720 730 740 750 760
GGG TCT CGC GGT ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT
Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg




770 780 790 800 810
ATC GTA GTT ATC TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT
Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn


820 830 840 850
AGA CAG ATC GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG
Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp









SEQ. ID NO. 3: range 1 to 795

AAGCTTTTTGCAGAAGCTCAGAATAAACGCAACTTTCCGGGTACCACC

10 20 30 40 50
ATG GGG CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT CAG TTG GGT GCA
Met Gly His Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala

60 70 80 90 100
CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG AGT
Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser




110 120 130 140 150
TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT CTG CTA
Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu




160 170 180 190 200 210
TGT GGC GCG GTA TTA TCC CGT ATT GAC GCC GGG CAA GAG CAA CTC GGT CGC
Cys Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg








220 230 240 250 260
CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC ACA GAA
Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu








270 280 290 300 310
AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT GCC ATA
Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile






320 330 340 350 360
ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC GGA GGA
Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly






370 380 390 400 410
CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA ACT CGC
Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg






420 430 440 450 460
CTT GAT CAT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT
Leu Asp His Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg






470 480 490 500 510
GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA ACT
Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu Thr




520 530 540 550 560
GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG GAG
Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met Glu


570 580 590 600 610
GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG GCT GGC TGG
Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro Ala Gly Trp




620 630 640 650 660
TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT ATC ATT
Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly Ile Ile




670 680 690 700 710 720
GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC GTA GTT ATC TAC ACG
Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile Val Val Ile Tyr Thr










730 740 750 760 770
ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT GAG ATA
Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile

780 790
GGT GCC TCA CTG ATT AAG CAT TGG
Gly Ala Ser Leu Ile Lys His Trp

SEQ. ID NO. 4: range 1 to 792



10 20 30 40 50
ATG GAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT CAG TTG GGT
Met Asp Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly

60 70 80 90 100
GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG
Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu

110 120 130 140 150
AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT CTG
Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu

160 170 180 190 200
CTA TGT GGC GCG GTA TTA TCC CGT ATT GAC GCC GGG CAA GAG CAA CTC GGT
Leu Cys Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly

210 220 230 240 250
CGC CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC ACA
Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr

260 270 280 290 300
GAA AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT GCC
Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala

310 320 330 340 350
ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC GGA
Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly

360 370 380 390 400
GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA ACT
Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr

410 420 430 440 450
CGC CTT GAT CAT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG
Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu

460 470 480 490 500 510
CGT GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA
Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu

520 530 540 550 560
ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG
Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met

570 580 590 600 610
GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG GCT GGC
Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro Ala Gly

620 630 640 650 660
TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT ATC
Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly Ile

670 680 690 700 710
ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC GTA GTT ATC TAC
Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile Val Val Ile Tyr

720 730 740 750 760
ACG ACC GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT GAG
Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala Glu

770 780 790
ATA GGT GCC TCA CTG ATT AAG CAT TGG
Ile Gly Ala Ser Leu Ile Lys His Trp
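
Each listing pairs a row of codons with a row of three-letter amino acid codes. As an illustrative aid only, the following minimal Python sketch shows how one row can be checked against the standard genetic code. The names `CODON_TABLE` and `translate` are assumptions introduced here and do not appear in the application.

```python
# Illustrative sketch: translate a row of space-separated DNA codons into
# three-letter amino acid codes using the standard genetic code.
# All names here are hypothetical helpers, not part of the application.

BASES = "TCAG"
# One-letter residues for the 64 codons in TCAG x TCAG x TCAG order.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # stop codon
}

CODON_TABLE = {
    a + b + c: THREE_LETTER[AMINO[16 * i + 4 * j + k]]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(codon_row: str) -> list[str]:
    """Return the three-letter residue for each codon in the row."""
    return [CODON_TABLE[codon] for codon in codon_row.upper().split()]


if __name__ == "__main__":
    # Final codon row of SEQ. ID NO. 4 as printed in the listing.
    print(" ".join(translate("ATA GGT GCC TCA CTG ATT AAG CAT TGG")))
```

Running the same check over any other row flags positions where the printed codon and the printed residue disagree.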



SEQ. ID NO. 5: range 1 to 786



10 20 30 40 50
ATG AAA GAT GAT TTT GCA AAA CTT GAG GAA CAA TTT GAT GCA AAA CTC GGG
Met Lys Asp Asp Phe Ala Lys Leu Glu Glu Gln Phe Asp Ala Lys Leu Gly

60 70 80 90 100
ATC TTT GCA TTG GAT ACA GGT ACA AAC CGG ACG GTA GCG TAT CGG CCG GAT
Ile Phe Ala Leu Asp Thr Gly Thr Asn Arg Thr Val Ala Tyr Arg Pro Asp

110 120 130 140 150
GAG CGT TTT GCT TTT GCT TCG ACG ATT AAG GCT TTA ACT GTA GGC GTG CTT
Glu Arg Phe Ala Phe Ala Ser Thr Ile Lys Ala Leu Thr Val Gly Val Leu

160 170 180 190 200
TTG CAA CAG AAA TCA ATA GAA GAT CTG AAC CAG AGA ATA ACA TAT ACA CGT
Leu Gln Gln Lys Ser Ile Glu Asp Leu Asn Gln Arg Ile Thr Tyr Thr Arg

210 220 230 240 250
GAT GAT CTT GTA AAC TAC AAC CCG ATT ACG GAA AAG CAC GTT GAT ACG GGA
Asp Asp Leu Val Asn Tyr Asn Pro Ile Thr Glu Lys His Val Asp Thr Gly
260 270 280 290 300
ATG ACG [...] AAA GAG CTT GCG GAT GCT TCG CTT CGA TAT AGT GAC AAT GCG
Met Thr Leu Lys Glu Leu Ala Asp Ala Ser Leu Arg Tyr Ser Asp Asn Ala


310 320 330 340 350
GCA CAG AAT CTC ATT CTT AAA CAA ATT GGC GGA CCT GAA AGT TTG AAA AAG
Ala Gln Asn Leu Ile Leu Lys Gln Ile Gly Gly Pro Glu Ser Leu Lys Lys




360 370 380 390 400
GAA CTG AGG AAG ATT GGT GAT GAG GTT ACA AAT CCC GAA CGA TTC GAA CCA
Glu Leu Arg Lys Ile Gly Asp Glu Val Thr Asn Pro Glu Arg Phe Glu Pro




410 420 430 440 450
GAG TTA AAT GAA GTG AAT CCG GGT GAA ACT CAG GAT ACC AGT ACA GCA AGA
Glu Leu Asn Glu Val Asn Pro Gly Glu Thr Gln Asp Thr Ser Thr Ala Arg




460 470 480 490 500 510
GCA CTT GTC ACA AGC CTT CGA GCC TTT GCT CTT GAA GAT AAA CTT CCA AGT
Ala Leu Val Thr Ser Leu Arg Ala Phe Ala Leu Glu Asp Lys Leu Pro Ser








520 530 540 550 560
GAA AAA CGC GAG CTT TTA ATC GAT TGG ATG AAA CGA AAT ACC ACT GGA GAC
Glu Lys Arg Glu Leu Leu Ile Asp Trp Met Lys Arg Asn Thr Thr Gly Asp






570 580 590 600 610
GCC TTA ATC CGT GCC GGA GTG CCG GAC GGT TGG GAA GTG GCT GAT AAA ACT
Ala Leu Ile Arg Ala Gly Val Pro Asp Gly Trp Glu Val Ala Asp Lys Thr

620 630 640 650 660
GGT GCG GCA TCA TAT GGA ACC CGG AAT GAC ATT GCC ATC ATT TGG CCG CCA
Gly Ala Ala Ser Tyr Gly Thr Arg Asn Asp Ile Ala Ile Ile Trp Pro Pro

670 680 690 700 710
AAA GGA GAT CCT GTC GTT CTT GCA GTA TTA TCC AGC AGG GAT AAA AAG GAC
Lys Gly Asp Pro Val Val Leu Ala Val Leu Ser Ser Arg Asp Lys Lys Asp






720 730 740 750 760
GCC AAG TAT GAT GAT AAA CTT ATT GCA GAG GCA ACA AAG GTG GTA ATG AAA
Ala Lys Tyr Asp Asp Lys Leu Ile Ala Glu Ala Thr Lys Val Val Met Lys






770 780
GCC TTA AAC ATG AAC GGC AAA
Ala Leu Asn Met Asn Gly Lys
We claim:

1. A method for identifying proteins or chemicals that directly or indirectly modulate
a genomic polynucleotide comprising:

providing a β-lactamase integrated into a non-yeast, eukaryotic genome contained
in at least one living cell,

contacting said cell with a predetermined concentration of a modulator, and
detecting BL activity from said cell.

2. The method of claim 1, wherein said detecting further comprises measuring
cleavage of a membrane permeant BL substrate, wherein said membrane permeant BL
substrate is transformed in said cell.

3. The method of claim 2, wherein said membrane permeant BL substrate comprises 
a donor and acceptor. 

4. The method of claim 3, wherein said detecting further comprises measuring FRET 
between said donor and said acceptor. 
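
As an illustrative aid only, the FRET readout of claims 3 and 4 can be sketched as a ratio of donor to acceptor emission. The sketch assumes that cleavage of the substrate separates donor from acceptor, so that donor emission rises relative to acceptor emission; the function name `emission_ratio` and the background parameters are assumptions introduced here, not terms of the claims.

```python
# Illustrative sketch: background-corrected donor/acceptor emission ratio.
# Assumption: a higher ratio indicates more cleaved substrate, hence more
# BL activity. Names and parameters are hypothetical.

def emission_ratio(
    donor: float,
    acceptor: float,
    donor_background: float = 0.0,
    acceptor_background: float = 0.0,
) -> float:
    """Return (donor - background) / (acceptor - background)."""
    corrected_donor = donor - donor_background
    corrected_acceptor = acceptor - acceptor_background
    if corrected_acceptor <= 0:
        raise ValueError("acceptor signal must exceed its background")
    return corrected_donor / corrected_acceptor


if __name__ == "__main__":
    # Made-up intensities for illustration only.
    print(emission_ratio(300.0, 150.0, 100.0, 50.0))
```

Using a ratio rather than a single intensity reduces the influence of cell size and substrate loading on the measurement.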

5. The method of claim 2, wherein said living cell is a mammalian cell. 

6. The method of claim 5, wherein said BL expression construct randomly integrates
into said genome.

7. The method of claim 6, wherein said living cell is contacted with said modulator 
prior to inserting of said BL expression construct in said non-yeast, eukaryotic genome 
and further comprising the step of determining the coding nucleic acid sequence of a 
polynucleotide operably linked to said BL expression construct, wherein said construct 
comprises a splice donor, a splice acceptor and an IRES element. 

8. The method of claim 5, wherein said BL expression construct encodes cytosolic 
BL and said cell comprises a receptor that is known to bind said modulator. 

9. The method of claim 8, wherein said receptor is a nuclear receptor heterologously 
expressed by said cell. 

10. The method of claim 8, wherein said receptor has a transmembrane domain and is 
homologously expressed by said cell. 

11. The method of claim 10, wherein said modulator is a non-peptide.

12. The method of claim 5, wherein said cell is contacted with a predetermined
concentration of a second modulator and detecting BL activity before and after contacting
said cell with said second modulator.

13. The method of claim 5, wherein said cell comprises an orphan protein
heterologously expressed by said cell.

14. The method of claim 5, wherein said BL activity is increased in the presence of
said modulator compared with the β-lactamase activity in the absence of said modulator.

15. The method of claim 5, wherein said modulator is known to bind to a receptor
expressed by said cell and said BL activity in said cell is increased in the presence of said
modulator compared to the BL activity detected from a corresponding cell in the presence
of said modulator, wherein said corresponding cell does not express said receptor.

16. A method of identifying active genomic polynucleotides, comprising: 
contacting living cells with a membrane permeant BL substrate, and 
sorting living cells by fluorescence, 

wherein said cells are eukaryotic cells and comprise a genome having a stably
integrated BL expression construct and said fluorescence indicates BL activity.

17. The method of claim 16, wherein said sorting further comprises measuring 
cleavage of a membrane permeant BL substrate by fluorescence spectroscopy in a FACS, 
wherein said membrane permeant BL substrate is transformed in said cell. 

18. The method of claim 17, wherein said membrane permeant BL substrate
has a donor and acceptor and said measuring further comprises measuring FRET between
a donor and an acceptor.

19. The method of claim 17, wherein said sorting further comprises separating said
cells without BL activity from said cells with BL activity.

20. The method of claim 19, wherein said cells are contacted with only a cell culture
medium in the absence of a test chemical.

21. The method of claim 20, wherein said cells without BL activity are contacted with 
a test chemical and further sorted by fluorescence for BL activity. 

22. The method of claim 21, wherein said test chemical is an agonist. 
23. The method of claim 21, wherein said test chemical is an antagonist.

24. The method of claim 21, wherein said cells without BL activity are contacted with 
a test chemical and further sorted by fluorescence for BL activity. 

25. The method of claim 22, wherein said cells with BL activity are contacted with an
antagonist and further sorted by fluorescence for BL activity.

26. The method of claim 17, wherein said cells express an identified receptor that 
binds a modulator known to bind to said identified receptor. 

27. The method of claim 26, wherein said living cells comprise a heterologous
G-protein.

28. The method of claim 17, wherein said living cells comprise a heterologous protein
having a membrane domain.

29. A composition of matter comprising a non-yeast, eukaryotic cell having a genome
with a stably integrated BL expression construct comprising a polynucleotide encoding a
protein having BL activity, an IRES element, a splice donor site and a splice acceptor
site.

30. The composition of matter of claim 29, further comprising a heterologous protein 
expressed in said cell. 

31. The composition of matter of claim 30, wherein said cell is a mammalian cell.

32. The composition of matter of claim 31, wherein said polynucleotide contains
nucleic acid sequences that are preferred by said mammalian cell for expression.

33. The composition of matter of claim 32, wherein said cell further comprises a
membrane permeant BL substrate, wherein said membrane permeant BL substrate is
transformed inside said cell by intracellular esterases.

34. The composition of matter of claim 33, wherein said polynucleotide encodes a
cytosolic BL.

35. A method of screening compounds with an active genomic polynucleotide, 
comprising: 

1) optionally contacting a multiclonal population of cells with a first test chemical 
prior to separating said cells by a FACS, 
2) separating by a FACS said multiclonal population of cells into BL expressing
cells and non-BL expressing cells, wherein said BL expressing cells have a detectable
difference in cellular fluorescence properties compared to non-BL expressing cells, and

Ai) contacting said non-BL expressing cells with a second test chemical, and
Aii) sorting by a FACS said non-BL expressing cells into a) second test chemical
activated cells and b) second test chemical non-activated cells,

wherein said second test chemical activated cells have BL activity detectable by a
FACS and said second test chemical non-activated cells have no BL activity detectable
by FACS, or

Bi) contacting said BL expressing cells with a third test chemical, and

Bii) sorting by a FACS said BL expressing cells into a) third test chemical
activated cells and b) third test chemical non-activated cells,
wherein said third test chemical activated cells have BL activity detectable by a
FACS and said third test chemical non-activated cells have no BL activity detectable by
FACS,

wherein said multiclonal population of cells comprises eukaryotic cells having a
BL expression construct integrated into a genome of said eukaryotic cells and a
membrane permeant BL substrate transformed inside said cells to a membrane
impermeant BL substrate.

36. The method of claim 35, wherein said BL activity is measured by FRET.

37. The method of claim 35, wherein said steps of Ai and Aii or Bi and Bii are 
repeated. 
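
As an illustrative aid only, steps 2), Ai) and Aii) of claim 35 can be sketched as two successive gates on a per-cell fluorescence ratio. The sketch assumes that BL activity is scored by comparing each cell's ratio with a single threshold; the names `sort_by_activity` and `screen_non_expressing`, the threshold, and the example values are assumptions introduced here, not part of the claims.

```python
# Illustrative sketch: two-pass sort of a cell population by BL activity.
# Assumption: a cell counts as BL-positive when its fluorescence ratio is
# at or above a chosen threshold. All names and values are hypothetical.
from typing import Dict, List, Tuple


def sort_by_activity(
    ratios: Dict[str, float], threshold: float
) -> Tuple[List[str], List[str]]:
    """Split cell ids into (BL-positive, BL-negative) lists."""
    positive = [cell for cell, r in ratios.items() if r >= threshold]
    negative = [cell for cell, r in ratios.items() if r < threshold]
    return positive, negative


def screen_non_expressing(
    before: Dict[str, float], after: Dict[str, float], threshold: float
) -> Tuple[List[str], List[str]]:
    """Keep cells that were BL-negative before treatment, then sort them
    again using the ratios measured after contact with a test chemical."""
    _, negative = sort_by_activity(before, threshold)
    treated = {cell: after[cell] for cell in negative}
    return sort_by_activity(treated, threshold)


if __name__ == "__main__":
    before = {"a": 0.2, "b": 3.0, "c": 0.1}
    after = {"a": 2.5, "b": 3.1, "c": 0.2}
    activated, non_activated = screen_non_expressing(before, after, 1.0)
    print(activated, non_activated)
```

Repeating the second pass with a different chemical corresponds to the repetition recited in claim 37.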

38. The method of claim 35, wherein said second test chemical activated cells are
washed, then contacted with a modulator in the presence of said second test chemical and
tested for BL activity.

39. The method of claim 38, wherein said modulator is present in a concentration of
10 μM or less.

40. The method of claim 35, wherein said eukaryotic cells express a heterologous 
protein. 
41. A method for identifying an expressed protein that directly or indirectly
modulates a genomic polynucleotide, comprising: providing at least one living non-yeast,
eukaryotic cell comprising a β-lactamase polynucleotide that can be under
transcriptional control of said at least one living non-yeast, eukaryotic cell's genome and
stably integrated into a genomic polynucleotide site, contacting said cell with a
predetermined concentration of a known modulator, and detecting β-lactamase activity
from said at least one living non-yeast, eukaryotic cell; wherein said at least one living
non-yeast, eukaryotic cell expresses a heterologous protein and said known modulator
increases or decreases the expression of said β-lactamase polynucleotide in the presence of
said heterologous protein.

42. The method of claim 41, wherein said detecting further comprises measuring
cleavage of a membrane permeant β-lactamase substrate, wherein said membrane
permeant β-lactamase substrate is transformed in said at least one living non-yeast,
eukaryotic cell.

43. The method of claim 42, wherein said membrane permeant β-lactamase substrate
has a donor and acceptor in said at least one living non-yeast, eukaryotic cell.

44. The method of claim 3, wherein said method further comprises sorting a
population of cells with a FACS.

45. The method of claim 41, wherein said cell is a mammalian cell.

46. The method of claim 45, wherein said β-lactamase polynucleotide includes a
β-lactamase expression construct for random integration into said genome.

47. The method of claim 46, further comprising the step of determining a portion of
the coding nucleic acid sequence of a polynucleotide operably linked to said β-lactamase
expression construct.

48. The method of claim 45, wherein said BL expression construct comprises
cytosolic β-lactamase, said construct comprises a splice donor, a splice acceptor and an
IRES element and said cell comprises a receptor that is known to bind said known
modulator.
49. The method of claim 45, wherein said heterologous protein is selected from the
group consisting of hormone receptors, intracellular receptors, receptors of the cytokine
superfamily, G-protein coupled receptors, heterologous G-proteins, neurotransmitter
receptors, and tyrosine kinase receptors.

50. The method of claim 45, wherein said heterologous protein has a transmembrane
domain.

51. The method of claim 50, further comprising overexpressing said heterologous
protein.

52. The method of claim 45, wherein said at least one living non-yeast, eukaryotic
cell is contacted with a predetermined concentration of a second modulator and detecting
β-lactamase activity after contacting said cell with said known modulator.

53. The method of claim 45, wherein said cell comprises an orphan protein
heterologously expressed by said at least one living non-yeast, eukaryotic cell.

54. The method of claim 45, wherein said β-lactamase activity is increased in the
presence of said modulator compared to the absence of said modulator.

55. The method of claim 45, wherein said known modulator is known to bind to a
receptor and said β-lactamase activity in said at least one living non-yeast, eukaryotic
cell is increased in the presence of said modulator compared to the β-lactamase activity
detected from a corresponding cell in the presence of said known modulator, wherein said
corresponding cell does not express said heterologous protein.

56. A method for identifying modulators, comprising: 

a) contacting at least one living mammalian cell with a test chemical at a 
predetermined concentration and a known modulator at a predetermined 
concentration, wherein said at least one living mammalian cell comprises a£r 

25 lactamase polynucleotide that can be under transcriptional control of said at least 

one living mammalian cell's genome and stably integrated into a genomic 
polynucleotide site, and 

b) detecting expression of said ^-lactamase polynucleotide by said at least one 
living mammalian cell, wherein said known modulator increases or decreases 
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expression of said ^lactamase polynucleotide located at said genomic 
polynucleotide site. 

57. The method of claim 56, wherein said test chemical changes expression of sai^f- 
iactamase polynucleotide by said known modulator. 
5 58. The method of claim 56, wherein said^ lactamase polynucleotide further 
comprises a splice acceptor site. 

59. The method of claim 58, wherein said ^-lactamase polynucleotide further 
comprises an IRES. 

60. The method of claim 57, wherein said test chemical or known modulator is 
10 provided at a concentration less than about l|iM. 

61. The method of claim 56, further comprising separating a population of living
mammalian cells into 1) a population of living mammalian cells that expresses
β-lactamase and 2) a population of living mammalian cells that does not express
β-lactamase.

62. The method of claim 61, wherein said separating further comprises measuring
cleavage of a membrane permeant β-lactamase substrate in said population of living
mammalian cells by fluorescence spectroscopy in a FACS, wherein the fluorescence of
said membrane permeant β-lactamase substrate is transformed by β-lactamase in at least
one living mammalian cell.

63. The method of claim 56, wherein said known modulator modulates a receptor
selected from the group consisting of intracellular receptors and G-protein coupled 
receptors. 

64. The method of claim 63, wherein said known modulator is an agonist. 

65. The method of claim 63, wherein said known modulator is an antagonist. 

66. The method of claim 64, wherein said known modulator is contacted with said at
least one living mammalian cell prior to contacting said test chemical with said at least
one living mammalian cell.

67. The method of claim 56, wherein said test chemical is a modulator for a protein
selected from the group consisting of hormone receptors, intracellular receptors, receptors
of the cytokine superfamily, G-protein coupled receptors, heterologous G-proteins,
neurotransmitter receptors, and tyrosine kinase receptors.

68. The method of claim 56, wherein said at least one living mammalian cell further 
comprises a heterologously expressed protein selected from the group consisting of 
hormone receptors, intracellular receptors, signaling molecules, receptors of the cytokine 
superfamily, G-protein coupled receptors, heterologous G-proteins, neurotransmitters, 
and tyrosine kinase receptors. 

69. The method of claim 68, wherein said heterologously expressed protein is a
G-protein coupled receptor or a heterologous G-protein.

70. The method of claim 56, further comprising the step of activating said at least one 
living mammalian cell with a G-protein coupled receptor modulator. 

71. The method of claim 70, wherein said at least one living mammalian cell further
comprises an orphan receptor. 

72. The method of claim 56, wherein said at least one living mammalian cell is of a cell
type from a panel of different cell types and steps (a) and (b) are performed on each cell 
type. 

73. The method of claim 56, wherein said genomic polynucleotide site is part of a 
gene not known to be modulated by said known modulator. 

74. The method of claim 73, wherein said known modulator is an agonist.

75. The method of claim 74, wherein said test chemical is an antagonist. 

76. The method of claim 73, wherein said known modulator is an antagonist. 

77. The method of claim 76, wherein said test chemical is an agonist. 

78. A method for identifying a modulator, comprising: 

a) contacting a population of non-yeast, eukaryotic cells with a test chemical and 
a known modulator, wherein said population of non-yeast, eukaryotic cells 
comprises a genome with a stably integrated β-lactamase expression construct,
comprising: 

1) a polynucleotide encoding a protein having β-lactamase activity, and

2) a splice acceptor site; and

b) detecting the activity of said β-lactamase polynucleotide expressed by said
population of non-yeast, eukaryotic cells, wherein said known modulator
increases or decreases the expression of said polynucleotide encoding a protein
having β-lactamase activity, and said known modulator modulates a biological
process or target.

79. The method of claim 78, wherein said β-lactamase expression construct further
comprises a splice donor site.

80. The method of claim 79, wherein said β-lactamase expression construct further
comprises an IRES element.

81. The method of claim 78, wherein said population of non-yeast, eukaryotic cells
further comprises an expressed heterologous G-protein coupled receptor.

82. The method of claim 81, wherein said population of non-yeast, eukaryotic cells
further comprises an orphan G-protein coupled receptor. 

83. A method for identifying a ligand of a target, comprising: contacting a eukaryotic
cell with a test chemical at a predetermined concentration, wherein said eukaryotic cell
comprises 1) a genomic polynucleotide with a β-lactamase expression construct under
expression control by a first polynucleotide in said genomic polynucleotide and 2) a
target that does not normally modulate transcription of a gene product under expression
control of said first polynucleotide with proviso that said target can directly or indirectly
alter expression of said β-lactamase expression construct under expression control by said
first polynucleotide, and

determining expression of said β-lactamase expression construct.

84. The method of claim 83, wherein said eukaryotic cell is a mammalian cell.

85. The method of claim 84, wherein said target is a heterologously expressed protein.

86. The method of claim 85, wherein said heterologously expressed protein is a
membrane protein.

87. The method of claim 84, wherein said heterologously expressed protein is a GPCR. 

88. The method of claim 84, wherein said heterologously expressed protein is an ion 
channel.

89. The method of claim 84, further comprising contacting a eukaryotic cell with a test
chemical at a predetermined concentration, wherein said eukaryotic cell comprises 1) a
genomic polynucleotide with a β-lactamase expression construct under expression control
by a first polynucleotide in said genomic polynucleotide and 2) a target that does not
normally modulate transcription of a gene product under expression control of said first
polynucleotide.

90. The method of claim 84, wherein said gene product is normally expressed in a first 
tissue and said target is normally expressed in a second tissue, wherein said first tissue is 
of a different embryonic origin than said second tissue. 

91. The method of claim 84, wherein said gene product is normally expressed in a first
cell in vivo and said target is normally expressed in a second cell in vivo, wherein said 
first cell is a different cell type than said second cell. 

92. The method of claim 84, wherein expression of said gene product is normally
repressed and said target does not increase expression of said gene product in vivo in
naturally occurring cells.

93. The method of claim 84, wherein said gene product is normally expressed in a first 
cell in vivo and said target is normally expressed in a second cell in vivo, wherein said 
first cell is a different cell type than said second cell. 

94. The method of claim 84, wherein expression of said gene product in said eukaryotic
cell is not detectable in the absence of said target and said eukaryotic cell does not
express detectable levels of protein of said target in the absence of heterologous
expression of said target.

95. The method of claim 84, wherein native protein of said gene product and native
protein of said target are not expressed in detectable levels in a single, naturally occurring
cell.

96. The method of claim 84, wherein native protein of said target in a naturally occurring 
cell does not modulate expression of native protein of said gene product in said naturally 
occurring cell. 

97. A method for identifying a cellular function of an orphan protein, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide and 2) an orphan protein,

determining expression of said β-lactamase expression construct, and

identifying the function of said genomic polynucleotide with said β-lactamase
expression construct or its corresponding gene where said β-lactamase expression
construct has integrated.

98. The method of claim 97, wherein said eukaryotic cell is a mammalian cell.

99. The method of claim 98, wherein said orphan is a heterologously expressed protein.

100. The method of claim 99, wherein said heterologously expressed orphan protein
has a putative transmembrane domain.

101. The method of claim 98, wherein said heterologously expressed orphan protein is 
homologous to a GPCR of known function and is overexpressed. 

102. A method for identifying a modulator of an orphan protein, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide and 2) an orphan protein that modulates expression of said β-lactamase
expression construct, and

determining expression of said β-lactamase expression construct.

103. The method of claim 102, wherein said eukaryotic cell is a mammalian cell. 

104. The method of claim 103, wherein said orphan protein is a heterologously
expressed protein. 

105. The method of claim 102, wherein said heterologously expressed orphan protein
has a putative transmembrane domain.

106. The method of claim 102, wherein said heterologously expressed orphan protein 
is over expressed and is homologous to a GPCR of known function. 

107. A method for identifying intracellular pathways, comprising:

expressing a protein of interest in a plurality of eukaryotic cells, wherein each
eukaryotic cell comprises a genomic polynucleotide with a β-lactamase expression
construct under expression control by a polynucleotide in said genomic polynucleotide,
and said plurality of cells has a plurality of integration sites where said β-lactamase
expression construct has integrated into said genome of each said eukaryotic cell,

optionally contacting said plurality of eukaryotic cells with a ligand of said 
protein of interest, 

determining expression from said β-lactamase expression construct, and

identifying said polynucleotide if said expressing of said protein of interest alters
expression from said β-lactamase expression construct or if said contacting said ligand of
said protein of interest alters expression from said β-lactamase expression construct,

wherein alteration of said expression from said β-lactamase expression construct
indicates participation of said protein of interest in an intracellular signaling pathway. 

108. The method of claim 107, wherein said eukaryotic cell is a mammalian cell.

109. The method of claim 108, wherein said protein of interest is a heterologously
expressed protein and has a known ligand. 

110. The method of claim 108, wherein said protein of interest is a heterologously
expressed protein and has no known ligand. 

111. The method of claim 109, further comprising isolating a eukaryotic cell from said
plurality of eukaryotic cells and characterizing said polynucleotide.

112. The method of claim 109, wherein each said eukaryotic cell in said plurality of
eukaryotic cells is an isolated, clonal population of cells. 

113. The method of claim 112, wherein said plurality of cells comprises at least
10,000 isolated clonal populations of cells. 

114. A method for determining a cellular response profile for a target, comprising: 
expressing a protein of interest in a plurality of eukaryotic cells, wherein each 

eukaryotic cell comprises a genomic polynucleotide with a β-lactamase expression
construct under expression control by a polynucleotide in said genomic polynucleotide,
and said plurality of cells has a plurality of integration sites where said β-lactamase
expression construct has integrated into said genome of each said eukaryotic cell,

optionally contacting said plurality of eukaryotic cells with a ligand of said 
protein of interest, 

determining expression from said β-lactamase expression constructs, and

identifying a plurality of said polynucleotides exhibiting an increase, decrease or no
change in expression from said β-lactamase expression that results from either said
expressing of said protein of interest or said contacting of said ligand,

wherein an increase, decrease or no change in expression of each said
polynucleotide from said plurality of polynucleotides indicates a profile of cellular
response relating to said protein of interest.

115. A method for determining a cellular response profile for a chemical, comprising: 
expressing a protein of interest in a plurality of eukaryotic cells, wherein each 

eukaryotic cell comprises a genomic polynucleotide with a β-lactamase expression
construct under expression control by a polynucleotide in said genomic polynucleotide,
and said plurality of cells has a plurality of integration sites where said β-lactamase
expression construct has integrated into said genome of each said eukaryotic cell, 

optionally contacting said plurality of eukaryotic cells with a ligand of said 
protein of interest, 

contacting said plurality of eukaryotic cells with a test chemical at a
predetermined concentration, and

determining expression from said β-lactamase expression constructs, and

identifying a plurality of said polynucleotides exhibiting an increase, decrease or no
change in expression from said β-lactamase expression that results from either said
expressing of said protein of interest or said contacting of said ligand,

wherein an increase, decrease or no change in expression of each said 
polynucleotide from said plurality of polynucleotides indicates a profile of cellular 
response relating to said test chemical. 

116. A method for identifying a modulator of a viral component, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide and 2) a viral component that is not previously known to modulate
transcription of a gene product under expression control of said first polynucleotide and
said viral component is not an oncogene or proto-oncogene or protein product thereof,
and

determining expression of said β-lactamase expression construct.

117. The method of claim 116, wherein said viral component is selected from the list
consisting of a virus, a capsule, a viral polynucleotide, or a viral protein.

118. The method of claim 117, further comprising contacting a second eukaryotic cell
with said test chemical at a predetermined concentration, wherein said eukaryotic cell
comprises 1) a second genomic polynucleotide with a β-lactamase expression construct
under expression control by a second polynucleotide in said second genomic
polynucleotide and 2) said viral component, and

determining expression of said β-lactamase expression construct, wherein said
viral component is selected from the list consisting of a virus, a capsule, a viral
polynucleotide, or a viral protein.

119. The method of claim 118, wherein said second eukaryotic cell is from a
population of eukaryotic cells, each said eukaryotic cell comprising 1) a genomic
polynucleotide with a β-lactamase expression construct and 2) said viral component.

120. A method for identifying a cellular function of a viral component, comprising: 
contacting a eukaryotic cell with a viral component at a predetermined
concentration or expressing a viral component in said eukaryotic cell, wherein said
eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase expression
construct under expression control by a first polynucleotide in said genomic 
polynucleotide, 

optionally contacting said eukaryotic cell with a second viral component of a virus
that is different from said viral component,

determining expression of said β-lactamase expression construct, and

identifying the function of said genomic polynucleotide with said β-lactamase
expression construct or gene where said β-lactamase expression construct has integrated.

121. A method for identifying a chemical that modulates a physiological response or
cellular pathway, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide, wherein said cell is characterized as comprising a physiological response
of interest or a cellular pathway of interest, and

contacting said eukaryotic cell with a signal molecule, and

determining expression of said β-lactamase expression construct.

122. The method of claim 121, said signal molecule is a naturally occurring molecule
that binds to the outside of said eukaryotic cell and said eukaryotic cell is a mammalian
cell.

123. The method of claim 122, said physiological response occurs in vivo in a cell
selected from the group consisting of a nerve cell, cardiac cell, epithelial cell, muscle cell, 
endocrine cell, paracrine cell, blood cell, and connective tissue cell. 

124. The method of claim 121, wherein said signal molecule increases expression.

125. The method of claim 124, wherein said polynucleotide has a gene product that
does not alter said cellular pathway or physiological response.

126. A chemical identified by any of the above methods for identifying useful 
chemicals. 

127. A method for identifying and developing a drug, comprising: 

1) contacting a population of non-yeast, eukaryotic cells with a test chemical and
a known modulator, wherein said population of non-yeast, eukaryotic cells
comprises a genome with a stably integrated β-lactamase expression construct,
comprising:

a) a polynucleotide encoding a protein having β-lactamase activity, and

b) a splice acceptor site; and

2) detecting expression of said β-lactamase polynucleotide expressed by said
population of non-yeast, eukaryotic cells, wherein said known modulator
increases or decreases the expression of said polynucleotide encoding a
protein having β-lactamase activity, and said known modulator modulates a
biological process or target,

3) determining whether said test chemical alters expression of said β-lactamase
polynucleotide, 

4) optionally testing for toxic effects of said test chemical in a cell-based assay, 

5) optionally generating a second test chemical based on the structure-property 
relationships of said test chemical, 

6) optionally determining whether said second test chemical alters expression of 
said β-lactamase polynucleotide,

7) testing for toxic effects of said test chemical or said second test chemical in a 
mammal, and 

8) testing for therapeutic effects of said test chemical or said second test 
chemical in a mammal. 

128. A drug chemical identified and developed by the following method, comprising: 

1) contacting a population of non-yeast, eukaryotic cells with a test chemical and 
a known modulator, wherein said population of non-yeast, eukaryotic cells 
comprises a genome with a stably integrated β-lactamase expression construct,
comprising: 

a) a polynucleotide encoding a protein having β-lactamase activity, and

b) a splice acceptor site; and 

2) detecting expression of said β-lactamase polynucleotide expressed by said
population of non-yeast, eukaryotic cells, wherein said known modulator
increases or decreases the expression of said polynucleotide encoding a protein
having β-lactamase activity, and said known modulator modulates a biological
process or target,

3) determining whether said test chemical alters expression of said β-lactamase
polynucleotide,

4) optionally testing for toxic effects of said test chemical in a cell-based assay, 

5) optionally generating a second test chemical based on the structure-property 
relationships of said test chemical,

6) optionally determining whether said second test chemical alters expression of 
said β-lactamase polynucleotide,

7) testing for toxic effects of said test chemical or said second test chemical in a 
mammal, and 

8) testing for therapeutic effects of said test chemical or said second test
chemical in a mammal.

129. The drug of claim 128, wherein said drug can be used to treat a medical condition 
selected from the group consisting of immune response, cardiac disfunctions and disease, 
vascular disfunctions and diseases, neural disfunctions and disease, endocrine
disfunctions and disease, gastro-intestinal disfunctions and disease, obesity, diabetes,
inflammation disfunctions and disease, cancer and trauma. 

130. A pharmaceutical composition, comprising a therapeutic agent and a 
pharmaceutically acceptable carrier.

131. The pharmaceutical composition of claim 130, said therapeutic agent having the
structure of Chemical A or B and said pharmaceutically acceptable carrier is selected for
treating undesired T-cell activation or an undesired immune response.


